Introduction

The Arka River is the most important inland water resource for agricultural needs in the Akkar flatland, located in North Lebanon. Known for its wide-ranging agricultural activities, the Akkar agricultural area, which accounts for 15% of the total usable agricultural area and 14% of the irrigated agricultural area of Lebanon (ECODIT Liban 2015), constitutes the main source of socioeconomic development in the region. However, excessive urbanization, growing industrial and agricultural activities, and excessive market demand have led to over-exploitation of water resources. Consequently, water quality is increasingly threatened by numerous pollutants from both natural and anthropogenic sources, raising doubts about the availability of these resources and impairing their proper use for irrigation (Varol et al. 2012).

In Lebanon, following the rapidly growing awareness of surface water quality and its impacts on public health, reliable assessments are increasingly required to guarantee better water management systems (Daou et al. 2013; Razmkhah et al. 2010). To address this situation and to promote sustainable agriculture, the quality of water used for agricultural purposes needs to be closely monitored (Haddad and Lindner 2001). This assessment is generally carried out by following up water quality through the measurement of several physicochemical and microbiological parameters at different control locations and at regular intervals (Razmkhah et al. 2010). Such studies can provide useful information on irrigation water and thus enable a proper evaluation of the deterioration of its quality and its implications for agricultural productivity (Mohammad Rusan et al. 2007; Rattan et al. 2005; Xu et al. 2010). Conversely, the impacts of agricultural practices and/or climatic changes on water quality need to be evaluated as well (Sun et al. 2015; Wang et al. 2013).

Moreover, in order to avoid water scarcity and counter water pollution from different human activities, the Food and Agriculture Organization (FAO) has recently developed quality guidelines for irrigation water. It has gone further by promoting the reuse of “treated water” for irrigation purposes, supporting sustainable development in the Middle East region and more particularly in Lebanon (Dimitri 2010).

As mentioned above, surface water quality assessments often include measurements of several parameters (physical, chemical, and biological). Furthermore, samples are collected at different sampling locations and during multiple time periods. This spatiotemporal analysis generates large and complex data sets, whose interpretation requires the use of multivariate data mining techniques (Gazzaz et al. 2012; Olsen et al. 2012). Numerous chemometric tools have been used in this field, such as discriminant function analysis (DFA), cluster analysis (CA), and principal factor analysis (PFA) (Gazzaz et al. 2012; Wang et al. 2013). However, the most established technique is principal component analysis (PCA) (Fan et al. 2010; Olsen et al. 2012; Ouyang 2005; Razmkhah et al. 2010; Walker et al. 2015; Zhao et al. 2012). It was therefore chosen for the present work.

In this context, the work presented in this paper aims to assess the variability of water quality in the Arka River, located in the Lebanese region of Akkar, over the course of the season. To our knowledge, few studies have been performed on this river since it is located far from central Lebanon. Because of this lack of information, several water sampling campaigns were conducted, followed by chemical and microbiological assessments. A multidimensional approach, using PCA, was applied to the resulting data set. This chemometric tool enables a reliable and pertinent interpretation of the results, making it possible to highlight discriminations between samples and to identify the parameters contributing to those discriminations. This survey will hopefully help to prioritize decisions aimed at improving surface water quality management in the Akkar region, especially regarding its use for irrigation.

Materials and methods

Study area

The Arka River is located in the Akkar governorate, in the far north of Lebanon. The Akkar region covers a total surface area of approximately 788 km² (OCHA 2014), and is bordered by the Mediterranean Sea to the west, Hermel-Bekaa to the east, Syria to the north, and the “El Bared” River to the south. It consists of nearly 283 communities, with a population estimated at approximately 253,000 inhabitants (OCHA 2014). The Arka River originates at the “Al Fawwar” spring (Rahbe, Akkar, Lebanon) and flows for 25 km before reaching the Mediterranean Sea, 20 km north of Tripoli (Fig. 1). This river serves as the main water supply for irrigation purposes in the Akkar plain.

Fig. 1

Map of the study area with the locations of water quality monitoring sites (1: Al Fawwar; 2: Deir Ouza; 3: Al Khale; 4: Ilate; 5: Bakarzla; 6: Arka; 7: Sahel; 8: Outlet)

Sampling locations

In the present study, eight sampling locations were selected along the river (Fig. 1). Their coordinates are given in Table 1. The locations were carefully selected to obtain representative sites throughout the river, including both reference and potentially contaminated areas. Water samples were collected at quasi-monthly intervals between March and July 2014 (3 and 17 March, 7 and 28 April, 5 and 23 June, and 15 July), giving a total of seven samples for each location. Samples were collected, preserved at 4 °C, and transported without delay, according to standard method NF T90-100.

Table 1 Locations of sampling sites along the Arka River

Analytical procedures

The analytical methods used for water quality assessment were ISO, AOAC, and NF standard methods. A total of 19 physicochemical parameters were monitored. The water quality parameters and the corresponding analytical methods applied for their determination are listed in Table 2. All spectrophotometric determinations were done using a Thermo Scientific™ Helios Beta spectrophotometer. A TOC-L Shimadzu analyzer was used for the determination of total organic carbon (TOC). Analytical data quality was guaranteed through the implementation of laboratory quality assurance and quality control methods, including the use of certified reference materials for internal control and frequent participation in proficiency tests organized by international organizations: FAPAS for physicochemical parameters and Eurofins for microbiological parameters. Total germs (Tot germs 22 and 37 °C), total coliforms (Tot colif), and thermotolerant coliforms (Fec colif) were enumerated by the membrane filtration technique according to ISO standard methods (ISO 6222:1999 for total germs and ISO 9308-1:2000 for coliforms). Samples were filtered through a 0.45 μm nitrocellulose membrane and transferred to adequate media for subsequent identification. Meteorological data for the Akkar region were recorded by the Lebanese Agricultural Research Institute (LARI) station based in Abdeh, Akkar, Lebanon, and were used to interpret the results.

Table 2 Water quality physicochemical parameters and analytical methods used for their determination

Multivariate data analysis: principal component analysis

Among multivariate analysis techniques, PCA is one of the most frequently used for surface water quality monitoring. PCA is an unsupervised data analysis tool that reduces the dimensionality of the data by calculating a set of new orthogonal axes or variables, known as principal components (PCs), which are linear combinations of the original variables (Shin et al. 2010). Considering a set of n objects described by p variables, the data may be gathered in a matrix X with n rows and p columns, where a row of X corresponds to an analytical curve (spectrum, chromatogram, etc.) or to a set of physicochemical measurements obtained from the analysis of a real sample, and a column of X corresponds to a variable. Briefly, PCA produces orthogonal components by decomposing the initial data matrix X into the matrix product T·Pᵀ (the superscript “T” denotes the transposed matrix). The matrix T is commonly called the “scores” matrix. The matrix P, called the “loadings” matrix, shows which variables are responsible for the patterns found in the scores T (Cordella et al. 2012). In this study, the data matrix X had the dimensions n = 64 and p = 23: the 64 rows comprised the 56 water samples (7 samples collected at each of the 8 selected locations) plus 8 rows holding the mean values for each of the 8 locations, and the 23 columns corresponded to the monitored physicochemical and microbiological parameters. PCA was performed on the data matrix X after column standardization (mean centering of each variable and division of the resulting values by the column’s standard deviation). Data treatment was done using MATLAB version R2011a (The MathWorks, Natick, USA).
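
The standardization and decomposition steps described above can be illustrated with a minimal sketch. It is not the MATLAB routine used in this study: it is written in Python, it extracts only the first component by power iteration on the correlation matrix, and the function names (standardize, first_component) and the small demonstration matrix are hypothetical.

```python
import math
import statistics

def standardize(X):
    """Column standardization: mean-center each variable, divide by its sample SD.
    Assumes no column is constant (a zero SD would raise ZeroDivisionError)."""
    cols = list(zip(*X))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in X]

def first_component(Z, iterations=500):
    """Return (loadings p1, scores t1 = Z p1, share of total variance) for PC1."""
    n, p = len(Z), len(Z[0])
    # Correlation matrix R = Z^T Z / (n - 1) of the standardized data.
    R = [[sum(Z[k][i] * Z[k][j] for k in range(n)) / (n - 1) for j in range(p)]
         for i in range(p)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(R[i][j] * v[j] for j in range(p)) for i in range(p))
    scores = [sum(z * w for z, w in zip(row, v)) for row in Z]
    # For standardized data the total variance equals the number of variables p.
    return v, scores, eigenvalue / p

# Made-up demonstration data: 5 objects (rows) described by 3 variables (columns).
X_demo = [[1.0, 2.1, 5.0], [2.0, 3.9, 3.0], [3.0, 6.2, 4.0],
          [4.0, 8.1, 1.0], [5.0, 9.8, 2.0]]
Z_demo = standardize(X_demo)
loadings, scores, share = first_component(Z_demo)
```

In this sketch, loadings plays the role of one column of P and scores of the matching column of T; repeating the procedure on the deflated matrix, or using a full eigendecomposition, would give the further components.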

Specific ultraviolet absorption indices

The specific ultraviolet absorbance (SUVA) index is defined as the UV absorbance (m⁻¹) at 254 nm divided by the dissolved organic carbon (DOC) concentration (mg L⁻¹). In this study, the SUVA of each sample was obtained by dividing the UV absorbance at 254 nm by the TOC concentration (mg L⁻¹). Since Lebanese surface water is known to be an oligotrophic medium, TOC was considered equivalent to DOC. Moreover, for all samples, almost all of the organic carbon was found in the dissolved state (TOC values were nearly equal to DOC values).
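
As a worked illustration of this definition, the following Python sketch computes SUVA from an absorbance reading and an organic carbon concentration. It assumes that the absorbance is read in a 1 cm cuvette and therefore has to be converted to m⁻¹; the path length, the function name suva, and the numerical values are assumptions made for the example, not data from this study.

```python
def suva(abs254_per_cm, organic_carbon_mg_per_l):
    """SUVA in L mg^-1 m^-1 from UV absorbance at 254 nm and organic carbon.

    abs254_per_cm: absorbance per cm of optical path (assumed 1 cm cuvette).
    organic_carbon_mg_per_l: DOC, or TOC when TOC is taken as equivalent to DOC.
    """
    if organic_carbon_mg_per_l <= 0:
        raise ValueError("organic carbon concentration must be positive")
    abs254_per_m = abs254_per_cm * 100.0  # convert cm^-1 to m^-1
    return abs254_per_m / organic_carbon_mg_per_l

# Hypothetical sample: absorbance of 0.08 per cm and 2.0 mg/L of organic carbon.
example_suva = suva(0.08, 2.0)
```

With these made-up inputs the index comes out at about 4 L mg⁻¹ m⁻¹.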

Results and discussion

Principal component analysis

In a first step, PCA was performed on the data matrix X (64 × 23), in order to investigate spatiotemporal discriminations between samples according to the 23 selected parameters. A standardized data matrix was used. This scaling step was performed to avoid spurious discriminations caused by wide differences in the scales and units of the variables. The Kaiser-Meyer-Olkin (KMO) test for sampling adequacy and Bartlett’s test of sphericity were performed to test the suitability of the data for PCA (Wang et al. 2013). In both cases, the data proved to be suitable for PCA (KMO = 0.68 and significance level = 0, indicating the presence of significant relationships among the variables) (Shrestha and Kazama 2007).
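
For reference, the two suitability statistics are usually written as follows. These are the standard textbook forms; the exact variant implemented by the software used in this study is not stated, so the expressions are given as an assumption. Here r_ij are the pairwise correlations between variables, a_ij the corresponding partial correlations, R the correlation matrix, n the number of objects, and p the number of variables.

```latex
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}
                    {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}},
\qquad
\chi^{2} = -\left(n - 1 - \frac{2p + 5}{6}\right) \ln \lvert R \rvert,
\qquad
\mathrm{df} = \frac{p(p - 1)}{2}
```

With p = 23 variables, Bartlett’s statistic is compared with a chi-square distribution with 253 degrees of freedom. KMO values above roughly 0.5 are conventionally taken as acceptable for PCA.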

The PCA results showed that, of the 23 calculated components, the first component (PC1) accounted for 40% and the second component (PC2) for 14% of the total variance. Together, these two components explained about 54% of the total variance and were thus selected for the following discussion. The remaining 21 components each explained only a small percentage of the variance and were therefore judged insignificant for this study.

From this point forward, for all PCA results presented, each PC plot consists of two figures. The upper one shows the ‘loadings’, which denote the contributions of the 23 original variables to the component. The lower one is the ‘scores’ plot, which shows the potential spatiotemporal discriminations of each sample or group of samples. It should be noted that the first type of scores (red circles) corresponds to the mean of the scores of the 7 sampling dates for each location. This removes the temporal factor and helps in understanding the spatial evolution more clearly.
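
The reason the mean rows can stand in for a site is that projection onto a PC is linear: as long as every row is centered and scaled with the same column statistics, the score of a site's mean row equals the mean of the scores of its individual samples. The short Python sketch below checks this numerically; the measurements, column statistics, and loadings are hypothetical values chosen only for illustration.

```python
def project(row, center, scale, loadings):
    """Score of one sample on one PC: standardize the row, then dot with loadings."""
    return sum((x - c) / s * w for x, c, s, w in zip(row, center, scale, loadings))

def column_means(rows):
    """Mean of each column over a list of equally long rows."""
    return [sum(col) / len(col) for col in zip(*rows)]

# Hypothetical values for one site: 3 sampling dates x 2 parameters.
site_rows = [[410.0, 7.9], [455.0, 8.1], [530.0, 8.3]]
center, scale = [450.0, 8.0], [60.0, 0.2]  # assumed column means and SDs
pc_loadings = [0.8, 0.6]                   # assumed loadings on one PC

mean_of_scores = sum(project(r, center, scale, pc_loadings)
                     for r in site_rows) / len(site_rows)
score_of_mean_row = project(column_means(site_rows), center, scale, pc_loadings)
```

Both quantities agree up to floating-point rounding, which is why a single point per location is enough to summarize the spatial trend.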

Figure 2a shows the scores of all 56 samples and the loadings corresponding to the 23 parameters on PC1. This first component is mainly characterized by mineral parameters (EC, Mg, Na, K, Cl, NO2, and B). Moreover, NH4 and SO4 also have higher loadings than the other parameters. Looking at the scores of the 56 samples on PC1, it can be clearly seen that all samples have scores around 0 except the one from location number 8 (see Fig. 1) sampled on 5 June, which is clearly discriminated from all other samples. At first, it was thought that this sample might be an outlier. However, superimposing this result on the meteorological data for the Akkar region (Fig. 2b) shows that a storm took place from 3 to 5 June 2014, with strong winds precisely on 5 June. These conditions contributed to loading the surface water collected at site 8 (outlet) with mineral components, through the intrusion of seawater inland. This explains why the sample was so clearly discriminated along the mineral component.

Fig. 2

a Scores of the 56 samples along the Arka River and loadings of the 23 studied parameters on PC1 (sampling dates: 1: 3 March; 2: 17 March; 3: 7 April; 4: 28 April; 5: 5 June; 6: 23 June; 7: 15 July 2014). b Meteorological data for the studied area as recorded by the Lebanese Agricultural Research Institute (LARI), Abdeh station, Akkar, Lebanon

The second component (PC2) (Fig. 3) mainly represents an organic or even a bacteriological character, with higher loadings for TOC, total germs, total coliforms, and fecal coliforms. Looking at the scores of the surface water samples, it can also be seen that one sample, from site number 5 (Bakarzla) sampled on 15 July, is clearly discriminated from all other samples. This discrimination could be due to an isolated environmental violation, such as an illegal wastewater discharge occurring shortly before this sampling date.

Fig. 3

Scores of the 56 samples along the Arka River and loadings of the 23 studied parameters on PC2 (sampling dates: 1: 3 March; 2: 17 March; 3: 7 April; 4: 28 April; 5: 5 June; 6: 23 June; 7: 15 July 2014)

From the results presented above, it is clear that more specific information regarding the spatiotemporal quality of the surface water is masked by the two highly discriminated samples. Therefore, these samples were removed from the data set, and PCA was repeated on the reduced matrix (62 × 23) in order to highlight new potential discriminations. The first three PCs were retained, explaining around 47% of the total variance.

Looking at the scores on the new PC1 (Fig. 4), an increasing spatial trend can be noticed: the scores increase from the spring to the outlet. This evolution is clearly visible in the scores corresponding to the means (red circles). According to the corresponding loadings, natural mineral parameters are responsible for this trend, mainly EC, Na, K, Cl, and Fe, but also parameters associated with organic pollution, such as NO3, NO2, COD, and TOC (loadings >0.2). This evolution is a classical trend for rivers like the Arka River: levels of mineral components increase from the spring to the outlet due to soil leaching and the intrusion of seawater constituents (El Moujabber et al. 2006; Wilbers et al. 2014). However, high levels of NO3, NO2, COD, and TOC are indicators of organic pollution, especially for stations 6, 7, and 8, which lie in the Akkar agricultural flatland. The presence of nitrate could originate from agricultural fields where fertilizers are frequently used (Razmkhah et al. 2010). Nitrite may also result from the reduction of nitrate in the presence of high levels of organic matter (Razmkhah et al. 2010), which could explain why these parameters contribute together to this PC. Moreover, a temporal evolution can also be noticed, especially for locations 6, 7, and 8, which are the nearest to the outlet. The scores of these sites increase until sampling date 5 (5 June). This increase could be explained by a concentration effect on mineral and anthropogenic parameters due to lower precipitation and higher temperatures as the summer season approaches in Lebanon. For sampling dates 6 (23 June) and 7 (15 July), this temporal evolution does not continue. The reason is that, during the last 2 months (dry season), water was no longer available in the river, since what remained was completely used for irrigation. To handle these missing data, the matrix rows corresponding to these samples were filled with the mean values of each parameter. As a result, their scores did not follow the increasing trend and were thus not considered in the interpretation.
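
The effect of filling missing rows with mean values can be made explicit with a small Python sketch. If each gap is replaced by the mean of the observed values in its column, the column mean is unchanged, so after standardization a fully imputed row becomes a row of zeros and its score on every PC is 0. The sketch assumes that missing entries are marked with None and that the means are taken over the observed values only; both conventions, the function names, and the data are hypothetical.

```python
import statistics

def impute_with_column_means(X):
    """Replace None entries with the mean of the observed values in that column."""
    cols = list(zip(*X))
    means = [statistics.fmean([v for v in col if v is not None]) for col in cols]
    return [[m if v is None else v for v, m in zip(row, means)] for row in X]

def standardize(X):
    """Mean-center each column and divide by its sample standard deviation."""
    cols = list(zip(*X))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in X]

# Made-up matrix: the last row stands for a dry-season sample with no data.
X_gaps = [[1.0, 10.0], [2.0, 14.0], [6.0, 12.0], [None, None]]
X_filled = impute_with_column_means(X_gaps)
Z_filled = standardize(X_filled)
```

Here the last row of Z_filled contains only zeros, which is consistent with the imputed samples sitting near the origin of the scores plot instead of following the trend of the measured samples.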

Fig. 4

Scores of the 54 samples along the Arka River and loadings of the 23 studied parameters on PC1 (sampling dates: 1: 3 March; 2: 17 March; 3: 7 April; 4: 28 April; 5: 5 June; 6: 23 June; 7: 15 July 2014)

The scores obtained on the new PC2 (Fig. 5) also show a specific trend. Scores start with high values (locations 1 and 2), then decrease until location 6, where they start to increase again until location 8. The major parameters involved in this inverted bell-shaped trend are Mg, Fe, COD, Abs254, and fecal coliforms (loadings ≥0.2). Moreover, on a temporal level, it can also be noticed that this evolution is more intense from sampling date 4 (28 April) onwards. PC3 (Fig. 6) shows the opposite pattern to that observed in PC2 (Fig. 5): scores start low for locations 1 and 2, increase for locations 4, 5, and 6, and then decrease towards location 8. The major parameters involved are pH, KMnO4, Abs254, B, total germs, total coliforms, and fecal coliforms (loadings >0.2). These two complementary trends indicate that the stations from station 3 onwards, and especially those located in the Akkar flatland (stations 5, 6, and 7), are subject to organic and bacterial pollution, probably due to anthropogenic stress coming from the Akkar flatland and the surrounding villages.

Fig. 5

Scores of the 54 samples along the Arka River and loadings of the 23 studied parameters on PC2 (sampling dates: 1: 3 March; 2: 17 March; 3: 7 April; 4: 28 April; 5: 5 June; 6: 23 June; 7: 15 July 2014)

Fig. 6

Scores of the 54 samples along the Arka River and loadings of the 23 studied parameters on PC3 (sampling dates: 1: 3 March; 2: 17 March; 3: 7 April; 4: 28 April; 5: 5 June; 6: 23 June; 7: 15 July 2014)

Specific ultraviolet absorption indices

The evolution of SUVA values for all surface water samples is shown as a box plot in Fig. 7. SUVA values are presented for each sampling location over the season. SUVA has been shown to correlate well with the aromatic content of natural organic material (NOM) dissolved in water (Bazrafshan et al. 2012). In fact, aromatic and humic substances exhibit higher absorption at 254 nm than aliphatic and non-humic substances (Hua et al. 2015). Moreover, the SUVA value of a water has been found to be a good surrogate for hydrophobic, aromatic, and high molecular weight (MW) NOM fractions, such as humic and fulvic acids (Ates et al. 2007). Therefore, higher SUVA values provide a preliminary indication that organic compounds originate from natural humic substances, whereas lower SUVA values point to an anthropogenic non-humic source (Bazrafshan et al. 2012). Looking at the box plot in Fig. 7, it can be seen that the highest SUVA values are obtained for stations 1 and 2. The values then decrease and oscillate around the reference value of 4 L mg⁻¹ m⁻¹ for all remaining stations. This observation may be explained by the fact that stations 1 and 2 are the nearest to the spring, so their organic matter originates from natural sources. However, starting from station 3 and for the stations across the Akkar flatland, organic matter originates mainly from anthropogenic sources due to human activities. These results agree well with those obtained for PC2 (Fig. 5) and PC3 (Fig. 6), where the same effect is highlighted.

Fig. 7

Evolution of Specific UV Absorption (SUVA) for the eight sampling sites along the Arka River

Conclusion

This study was undertaken to monitor water quality in the Arka River, which constitutes the main inland water resource for human and agricultural needs in the Akkar region, located in the north of Lebanon. For this purpose, 19 physicochemical parameters and 4 microbiological ones were assessed at 8 sampling locations along the river during 7 sampling campaigns between March and July 2014. PCA, one of the most established multivariate analysis techniques in this field, was used to evaluate the spatial and temporal variations in surface water quality. In a first step, two PCs were considered, the first characterized by mineral parameters and the second by organic and bacteriological parameters. Two particular samples were discriminated because of isolated meteorological and anthropogenic events. In order to extract further meaningful discriminations, these two samples were removed and a second PCA was performed on the reduced matrix. The new PCA results highlighted several spatial and temporal trends and helped identify the factors responsible for variations in water quality. These variations were related to physicochemical parameters originating from soil leaching and seawater intrusion, but also to organic and bacteriological parameters indicating anthropogenic pollution, especially for stations located in the Akkar flatland. SUVA results supported these findings and showed that the indices decreased with distance from the source, indicating a more pronounced anthropogenic character of the dissolved organic matter. In conclusion, this study allowed a clear assessment of the surface water quality of the Arka River, which will hopefully help to identify and control pollution sources coming mainly from the flatland and the surrounding villages. An effective surface water quality management system may thus be established, enabling a proper use of water for irrigation purposes.