INTRODUCTION

Monitoring and assessing water quality is a crucial sustainability issue for surface waters, especially rivers. River water quality can deteriorate through natural processes and, more recently, through anthropogenic activities: the discharge of industrial, mining and domestic wastewater, as well as agricultural drainage, into rivers [27]. Water quality is commonly evaluated by comparing the measured physical-chemical parameters with national or international limit values, which in itself gives no indication of the source of contamination. Statistical methods have proved well suited to processing and analyzing these steadily growing data sets and are perhaps the most widely used tools for this purpose [6, 16]. In the scientific literature, different statistical techniques, including Factor Analysis (FA), the Pearson Correlation Index (PCI), Principal Component Analysis (PCA) and Confirmatory Factor Analysis (CFA), have been used to assess temporal and spatial variations in river water quality and to identify potential sources of water contamination [9, 14, 17, 22, 24, 29, 31]. However, to the authors’ knowledge, only limited research on the effectiveness of multivariate models for the assessment and management of water quality has been conducted thus far [29, 31]. Several studies of water quality and pollution sources in many Serbian rivers have been conducted in recent years [3, 12, 13, 19, 20], showing that their water quality is not satisfactory.

In this study we analyze and interpret a large data set obtained during twenty years (1997–2016) of monitoring of the Sava River in Serbia. The paper assesses the physical-chemical water quality (PCWQ) of the Sava River in accordance with the Water Law and the Water Quality Classification Criteria of ICPDR [7], which are harmonized in Serbia. The physical-chemical data were analyzed using statistical methods and modeling. This study utilizes analysis of moment structures (AMOS) to confirm the goodness of fit of the preceding factor analysis model. Structural equation modeling (SEM), carried out using IBM-SPSS Amos, allows simultaneous analysis of the entire system of parameters. When factor analyses and discriminant analyses are applied, they are conducted at the same time to obtain optimal results of statistical analyses. During the confirmation, we examine the hypothesized orthogonal results and use oblique rotation to explore the goodness of fit of the reflective indicators of the orthogonal rotation.

For this purpose, statistical techniques were applied to (1) determine the similarities and differences between the values of selected PCWQ parameters at the selected sampling stations, (2) evaluate the contribution of PCWQ parameters to temporal variations in surface water quality and (3) identify the types of pollution sources.

This study aims to build a standard set of methods that can be applied by the authorities in order to improve the application of statistical analysis of water quality data. The results are expected to help evaluate the spatial–temporal evolution of Sava River PCWQ and consequently enable managers to understand the main types of pollution sources in the river basin.

DATA AND METHODS

Study Area and Collection of Samples

The Sava River is the second-longest tributary of the Danube, stretching 945 km across south-eastern Europe. The Sava River Basin (SRB) is a major drainage basin in south-eastern Europe and one of the most significant sub-basins of the Danube River Basin (12% of the total Danube Basin area), extending across Slovenia, Croatia, Bosnia and Herzegovina, and Serbia, with a basin area of about 97 000 km² [8]. Based on the specific characteristics of the waterway, the Sava River can be divided into three parts: Upper Sava, Middle Sava and Lower Sava, according to the classification of the International Sava River Basin Commission [8]. In this study we analyzed the 207 km long section of the Sava River in Serbia.

The database of the Serbian Environmental Protection Agency [23] for the twenty-year period 1997–2016 was used. PCWQ parameters were measured at three sample stations (SSt) on the Serbian part of the Sava River (Jamena, Šabac and Ostružnica) (Fig. 1). At each SSt ten parameters were analyzed: Temperature (T), pH, Electrical Conductivity (EC), Dissolved Oxygen (DO), Biochemical Oxygen Demand (BOD5), Suspended Solids (SS), nitrate (NO3-N), nitrite (NO2-N), orthophosphates (PO4-P), and ammonium nitrogen (NH4-N). Measurements were performed on a monthly basis.

Fig. 1. Sava River with the position of sample stations.

Statistical Methods Used

The statistical techniques used for the analyses included one-way analysis of variance (ANOVA), PCI, PCA and CFA. All mathematical and statistical computations were made using SPSS 25.0. ANOVA with the post-hoc Scheffé test [13, 20], one of the strictest and most often used tests, was applied to identify significant differences between PCWQ variables. The obtained results are presented through water quality classes defined on the basis of the ICPDR criteria [7] (Table 1). PCI was applied to determine the relationships among the 10 PCWQ variables [21]. PCA is a multivariate statistical method for data reduction with minimum loss of original information [26]. It was applied for standardization of the monitored data, variable weighting and equation establishment. Each PCWQ parameter was standardized before PCA. The oblimin rotation technique was also used to find strong associations between data sources and important factors [15]. A Kaiser-Meyer-Olkin (KMO) value >0.5 and significance of Bartlett’s test of sphericity at the p < 0.05 level were taken as the criteria for the suitability of the data for PCA [2, 25]. Subsequently, based on the Scree Plot, the PCs with initial eigenvalues less than 1.0 were discarded to reduce dimensionality. The corresponding rotation sum of squared loadings describes the percentage of variance explained by the newly constructed factors [15]. This study applies factor analyses to identify factors. These factors are analyzed with the AMOS modeling software. We apply CFA to analyze the goodness of fit, reliability, and validity of the factors obtained by factor analysis. CFA evaluates the goodness of fit of the factor analysis model from the perspective of confirmation; a p-value higher than 0.05 is defined as good [1, 31]. The model should also be checked against diverse fit indices in order to assess its internal quality. The evaluation standard is shown in Table 2 [4].
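The two suitability checks named above (KMO > 0.5 and a significant Bartlett's test of sphericity) can be sketched in a few lines. This is a minimal illustration, not the SPSS procedure used in the study: the function names, the synthetic six-variable data set and the random seed are assumptions made only for the example.

```python
import numpy as np
from scipy import stats


def bartlett_sphericity(data):
    """Bartlett's test of sphericity: H0 is that the correlation matrix is the identity."""
    n_samples, n_vars = data.shape
    corr = np.corrcoef(data, rowvar=False)
    # slogdet is numerically safer than log(det(corr)) when corr is close to singular
    _, log_det = np.linalg.slogdet(corr)
    statistic = -(n_samples - 1 - (2 * n_vars + 5) / 6) * log_det
    dof = n_vars * (n_vars - 1) / 2
    return statistic, stats.chi2.sf(statistic, dof)


def kmo_overall(data):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations follow from the inverse of the correlation matrix
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diagonal] ** 2)
    a2 = np.sum(partial[off_diagonal] ** 2)
    return r2 / (r2 + a2)


# ASSUMPTION: synthetic stand-in data (240 samples, 6 variables driven by 2 latent factors)
rng = np.random.default_rng(42)
latent = rng.normal(size=(240, 2))
pattern = np.array([[1, 1, 1, 0, 0, 0],
                    [0, 0, 0, 1, 1, 1]], dtype=float)
data = latent @ pattern + 0.5 * rng.normal(size=(240, 6))

chi_square, p_value = bartlett_sphericity(data)
kmo = kmo_overall(data)
suitable = bool(kmo > 0.5 and p_value < 0.05)
print(f"Bartlett chi2 = {chi_square:.1f}, p = {p_value:.3g}, KMO = {kmo:.3f}, suitable = {suitable}")
```

Both criteria are computed from the correlation matrix alone, so the same functions could be applied to any table with samples in rows and parameters in columns.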

Table 1.   Water Quality Classification Criteria of ICPDR [7]
Table 2.   SEM evaluation indices and standard for the goodness of fit of model (GFI—Goodness of fit index, AGFI—Adjusted goodness of fit index, SRMR—Standardized root mean squared residual, RMSEA—Root mean square error of approximation, NFI—Bentler-Bonett normed fit index, RFI—Relative fit index, IFI—Bollen’s incremental fit index, TLI—Tucker–Lewis index, CFI—Comparative fit index)

RESULTS

Results of ANOVA

One-way ANOVA was performed to check the differences among parameters. The mean annual values (Fig. 2) of the PCWQ parameters T, pH and NO2-N were not significantly different on the Sava River (p > 0.01 or p > 0.05). On the other hand, the mean annual values of EC, DO, BOD5, SS, NO3-N, NH4-N and PO4-P were significantly different (p < 0.01).

Fig. 2. Mean annual values of PCWQ parameters of the Sava River in the stream section in Serbia.

Parameter analysis by months showed statistically significant differences (p < 0.01 or p < 0.05) for parameters T, EC, DO, BOD5, SS, NO3-N and NO2-N (Table 3). High mean values of pH, EC, BOD5 and NH4-N, as well as lower values of DO, were registered in the warmer period of the year. High mean values of SS, NO3-N, NO2-N and PO4-P were registered in the colder period of the year. Given that degraded values of some PCWQ parameters were registered during the colder period of the year, it can be concluded that temperature, as a natural factor, is not always dominant and that parameter values are also affected by anthropogenic influences. Parameter analysis by stations showed statistically significant differences (p < 0.01) for parameters EC, BOD5, SS and NO2-N (Table 4). The highest mean values of T, EC, BOD5 and NH4-N were registered at SSt Ostružnica. The highest mean values of SS, NO3-N, NO2-N and PO4-P were registered at SSt Jamena. These results indicate a significantly lower PCWQ at SSts Jamena and Ostružnica. The highest values of NO3-N were registered at SSt Jamena, and the value of this parameter decreases in the downstream part of the Sava River. The highest values of DO and better PCWQ were registered at SSt Šabac (Table 4).
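As an illustration of the type of test behind these comparisons, the following sketch runs a one-way ANOVA followed by pairwise Scheffé comparisons. It is not the SPSS output of the study: the three groups are synthetic (hypothetical stations A, B and C), and the helper name scheffe_pairwise is an assumption introduced only for the example.

```python
import numpy as np
from scipy import stats


def scheffe_pairwise(groups, alpha=0.05):
    """Scheffe post-hoc test for every pair of group means.

    Returns {(i, j): (F_s, significant)} where F_s is compared with the
    critical F value for (k - 1, N - k) degrees of freedom.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    df_within = n_total - k
    # Within-group mean square, i.e. the error term of the one-way ANOVA
    ss_within = sum(np.sum((g - np.mean(g)) ** 2) for g in groups)
    ms_within = ss_within / df_within
    f_crit = stats.f.ppf(1 - alpha, k - 1, df_within)
    results = {}
    for i in range(k):
        for j in range(i + 1, k):
            diff = np.mean(groups[i]) - np.mean(groups[j])
            pooled = ms_within * (1 / len(groups[i]) + 1 / len(groups[j]))
            f_s = diff ** 2 / pooled / (k - 1)
            results[(i, j)] = (f_s, bool(f_s > f_crit))
    return results


# ASSUMPTION: synthetic monthly values for three hypothetical stations A, B, C
rng = np.random.default_rng(7)
groups = [rng.normal(loc=mean, scale=1.0, size=240) for mean in (10.0, 10.0, 12.0)]

f_stat, p_value = stats.f_oneway(*groups)
pairs = scheffe_pairwise(groups, alpha=0.01)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.3g}")
for (i, j), (f_s, significant) in pairs.items():
    print(f"stations {i} vs {j}: F_s = {f_s:.2f}, significant = {significant}")
```

The Scheffé criterion divides each pairwise statistic by k - 1, which is what makes it more conservative than testing each pair separately.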

Table 3.   Mean monthly values of PCWQ parameters on the Sava River in the stream section in Serbia and ANOVA results
Table 4.   Mean values of PCWQ parameters for all SSt on the Sava River in the stream section in Serbia and ANOVA results

Results of PCI

Correlations between the PCWQ parameters were calculated using the Pearson Correlation Index and are shown in Table 5. A moderate negative correlation was found between T and DO (r = ‒0.543; p < 0.01) and a weak positive correlation between EC and PO4-P (r = 0.326; p < 0.01). At the 0.01 significance level, very weak positive correlations exist between T and EC, between EC and NO3-N, between DO and SS, between BOD5 and NH4-N and between NO3-N and PO4-P. Moreover, very weak negative correlations were detected between T and both SS and NO3-N, between pH and SS and between EC and SS (Table 5).
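A Pearson coefficient and its verbal label can be obtained as in the sketch below. The cut-off values in strength_label are an assumption (one common convention, chosen because it is consistent with the labels used above); the text does not state which thresholds were applied. The temperature and oxygen series are synthetic.

```python
import numpy as np
from scipy import stats


def strength_label(r):
    """Verbal label for |r|. ASSUMPTION: 0.2 / 0.4 / 0.6 / 0.8 cut-offs, not taken from the paper."""
    size = abs(r)
    if size < 0.2:
        return "very weak"
    if size < 0.4:
        return "weak"
    if size < 0.6:
        return "moderate"
    if size < 0.8:
        return "strong"
    return "very strong"


# Coefficients reported in the text
print(strength_label(-0.543))  # T vs DO
print(strength_label(0.326))   # EC vs PO4-P

# ASSUMPTION: synthetic data in which dissolved oxygen falls as temperature rises
rng = np.random.default_rng(3)
temperature = rng.uniform(2.0, 26.0, size=240)
dissolved_oxygen = 13.0 - 0.15 * temperature + rng.normal(0.0, 1.5, size=240)

r, p_value = stats.pearsonr(temperature, dissolved_oxygen)
print(f"r = {r:.3f}, p = {p_value:.3g}, {strength_label(r)}")
```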

Table 5.   Pearson correlation between PCWQ parameters

Results of PCA

The correlation matrix revealed a number of coefficients of 0.5 and higher. The value of the Kaiser-Meyer-Olkin indicator is 0.605, which exceeds the recommended minimum of 0.5 [3]. Furthermore, Bartlett’s test of sphericity reached statistical significance (p = 0.000), indicating the factorability of the correlation matrix. PCA revealed the presence of four factors with eigenvalues over 1, which explain 17.57, 14.71, 11.08 and 10.98% of the variance. The four-component solution explained a total of 54.33% of the variance. To help interpret these four factors, oblimin rotation was conducted as well. Oblimin rotation revealed the existence of a simple structure, with all factors having high factor weights of individual parameters. The first factor contributes 17.57% to the overall variability and has a high positive loading for DO (0.852) and a high negative loading for water T (‒0.834) (Table 6). The first factor can be termed the “seasonal” factor (Fig. 3), because temperature affects the distribution and living conditions in aquatic ecosystems, as it can affect many physical, chemical and biological processes [11]. The second factor contributes 14.71% to the total variability and is most strongly positively correlated with EC (0.784) and PO4-P (0.772) (Table 6). The second factor can be referred to as the “eutrophication” factor (Fig. 3), because excess levels of PO4-P in surface water cause eutrophication. The discharge of domestic and industrial wastewater and the drainage of fertilized agricultural land contribute to the increase in its concentration. The third factor contributes 11.08% to the total variability and has a high positive correlation with SS (0.769) and a medium negative correlation with pH (‒0.622) (Table 6). The third factor can be referred to as the “sediment pollution” factor (Fig. 3), because the value of SS is closely linked to erosion and to the transport of nutrients (phosphorus, especially), metals, industrial waste, and chemicals used in agriculture [5]. Increased concentrations of SS could also be explained by the presence in water, besides microorganisms, of colloidal materials such as silt, clay, and organic and inorganic material in the form of fine particles [30]. The fourth factor contributes 10.98% to the total variability and has a high positive correlation with BOD5 (0.757) and a medium positive correlation with NH4-N (0.687) (Table 6). The fourth factor can be termed the “organic pollution” factor (Fig. 3), because BOD5 reflects the decomposition of organic substances by bacteria under aerobic conditions and the amount of oxygen they consume during mineralization, which is usually higher in wastewaters [2]. NH4-N, in turn, originates from nitrogen-containing organic material, from gas exchange between the water and the atmosphere, from the biodegradation of waste, and from domestic, agricultural and industrial inputs [2].
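The retention rule used here (keep components whose eigenvalue exceeds 1) can be sketched with plain NumPy. The example below covers only the unrotated extraction step; oblimin rotation is not reproduced. The data are synthetic, the function name kaiser_components is an assumption, and the conversion of the reported percentages into eigenvalues assumes that they refer to components of a 10-variable correlation matrix, for which each eigenvalue equals the percentage divided by 10.

```python
import numpy as np


def kaiser_components(data):
    """PCA on the correlation matrix; keep components with eigenvalue > 1 (Kaiser criterion)."""
    # Explicit z-score standardization mirrors the step described in the methods;
    # np.corrcoef would give the same correlation matrix without it.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]          # sort from largest to smallest
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    keep = eigenvalues > 1.0
    explained_pct = 100 * eigenvalues / eigenvalues.sum()
    # Unrotated loadings: correlations between variables and retained components
    loadings = eigenvectors[:, keep] * np.sqrt(eigenvalues[keep])
    return eigenvalues, explained_pct, loadings


# ASSUMPTION: synthetic data with 10 variables, standing in for the PCWQ parameters
rng = np.random.default_rng(1)
latent = rng.normal(size=(240, 4))
pattern = rng.normal(size=(4, 10))
data = latent @ pattern + rng.normal(size=(240, 10))

eigenvalues, explained_pct, loadings = kaiser_components(data)
print("retained components:", loadings.shape[1])

# Consistency check on the percentages reported in the text
n_parameters = 10
reported_pct = [17.57, 14.71, 11.08, 10.98]
implied_eigenvalues = [pct / 100 * n_parameters for pct in reported_pct]
print("implied eigenvalues:", [round(v, 3) for v in implied_eigenvalues])
print("total explained:", round(sum(reported_pct), 2))
```

Under the stated assumption, all four implied eigenvalues lie above 1, and the four percentages add up to 54.34, which agrees with the reported total of 54.33% up to rounding.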

Table 6.   Factor score coefficient matrix and correlation of variables and factors for PCA with oblimin rotation for scale items PCWQ parameters (factor load value greater than 0.75—connection “high”; factor load value from 0.75–0.5—connection “medium”; factor load value less than 0.5—connection “low.” Source: Created by the authors based on data analysis in SPSS v25.0)
Fig. 3. Correlation of CPF1/CPF2/CPF3/CPF4 of investigated PCWQ parameters for section in Serbia.

Results of CFA

This study applies factor analysis to identify four factors, related to seasonality, eutrophication, sediment pollution and organic pollution (Table 6). These four factors are analyzed with the AMOS modeling software. Oblique rotation was used to analyze and modify the original orthogonal rotation in order to identify the correlations among the four factors, given the relationships among the ten PCWQ parameters.

According to Fig. 4, the model fit indices demonstrated that the overall statistics for the PCWQ model were satisfactory: χ² (14) = 35.870, p = 0.001, CMIN/DF = 2.56, RMSEA = 0.047, SRMR = 5.5, NFI = 0.922, RFI = 0.843, IFI = 0.951, GFI = 0.987, AGFI = 0.968, TLI = 0.898, CFI = 0.949 (Table 7).
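Some of these statistics follow directly from the chi-square value and its degrees of freedom, which allows a quick consistency check. The sketch below recomputes the chi-square to degrees-of-freedom ratio and the p-value from the reported figures, and evaluates the usual RMSEA point estimate, sqrt(max(χ² − df, 0) / (df · (N − 1))). The sample size is not stated in the text, so the value of 720 (3 stations × 20 years × 12 months with no gaps) is purely an assumption.

```python
import math

from scipy import stats


def chi_square_ratio(chi_square, dof):
    """Chi-square divided by its degrees of freedom."""
    return chi_square / dof


def rmsea(chi_square, dof, n_obs):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (N - 1)))."""
    return math.sqrt(max(chi_square - dof, 0.0) / (dof * (n_obs - 1)))


chi_square, dof = 35.870, 14                  # values reported in the text
ratio = chi_square_ratio(chi_square, dof)
p_value = stats.chi2.sf(chi_square, dof)      # upper-tail probability

# ASSUMPTION: hypothetical sample size, 3 stations x 20 years x 12 monthly samples
n_hypothetical = 3 * 20 * 12
approx_rmsea = rmsea(chi_square, dof, n_hypothetical)

print(f"ratio = {ratio:.2f}, p = {p_value:.4f}, RMSEA (N = {n_hypothetical}) = {approx_rmsea:.3f}")
```

The ratio reproduces the reported 2.56 and the p-value rounds to the reported 0.001. With the assumed sample size the RMSEA comes out close to 0.047, although this agreement depends entirely on that assumption.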

Fig. 4. Results of oblique rotation (factor analyses).

Table 7.   Evaluation indices of second specification search (oblique rotation; GFI—Goodness of fit index, AGFI—Adjusted goodness of fit index, SRMR—Standardized root mean squared residual, RMSEA—Root mean square error of approximation, NFI—Bentler-Bonett normed fit index, RFI—Relative fit index, IFI—Bollen’s incremental fit index, TLI—Tucker–Lewis index, CFI—Comparative fit index)

The orthogonal model stands, given that the algorithm results show that the covariance curve is included in all four factors and no related issues are detected in the goodness of fit of reflective indicators.

The results of the oblique rotation show covariance connections among the four factors. The correlation coefficient between the seasonality and eutrophication factors is the highest. The explanation is excessive algae growth due to a change (increase) in temperature.

DISCUSSION

The PCWQ of the Sava River was assessed through the analysis of ten PCWQ parameters. Water T and pH values on the Sava River (Fig. 2; Tables 3, 4) were relatively uniform over the 20-year period and do not show statistically significant differences. According to the Water Quality Classification Criteria of ICPDR, water T and pH at all SSt were within water quality classes I and II [7]. The measured pH values are closer to the upper end of the limit range (6.5‒8.5). Alkaline water itself contains negatively charged OH⁻ ions, which can lead to higher availability of oxygen and thus better water quality as well as better self-purifying capacity. Slightly higher pH values were registered during the summer months and the warmer period of the year. The results of PCI show a negative correlation between pH and SS, which was confirmed by F3 of the PCA (Table 6).

Measured values of EC (Fig. 2; Tables 3, 4) differ most between 2013 and 2004, when the lowest and highest values of this parameter were registered. Slightly higher values of EC were recorded in the period from July to November. The highest values of EC were recorded at SSt Ostružnica in 2004. None of the measured EC values exceeded the limit value for class-II water quality (1000 μS/cm) [7] at any SSt. Since the EC of surface waters is directly related to the concentration of dissolved solids in the water [28], the increase in EC values at the downstream station could be linked to both agricultural and domestic human activities. As the water T increased seasonally, the amount of DO decreased but EC increased. The results of the PCI analysis support this relation, as there is a positive correlation between EC and T (Table 5).

Mean concentrations of DO in this study qualify the PCWQ of the Sava River as good to excellent (Fig. 2; Table 3). The largest differences are observed between 2014 and 1997, when the lowest and highest values of this parameter were registered. Significantly higher values of DO and better water quality were recorded in the winter period of the year. All measured DO values are above the limit value for class-II water quality [7]. The highest value of this parameter was registered at SSt Šabac (10.0 mg/L), while the lowest was recorded at SSt Jamena (9.5 mg/L) (Table 4). In the correlation matrix, DO was moderately negatively correlated with T (r = ‒0.543; p < 0.01) (Table 5). This is explained by the fact that increasing T reduces the dissolution of ambient oxygen in river water [10]. Also, low DO values in summer (Table 3) can be linked to higher water temperature and high activity of microorganisms, which require large amounts of oxygen for their metabolism and for the degradation of organic matter [32]. PCA confirmed the negative correlation between DO and T in F1 (Table 6).

Measured values of BOD5 at all stations were within class-I water quality according to ICPDR. As with many other parameters, BOD5 values also differ significantly between the stations (p = 0.000, F = 11.710) (Table 4). The lowest values were registered at SSt Šabac (1.7 mg/L) and the highest at SSt Ostružnica (2.1 mg/L). The greatest differences are observed between 2013 and 2002 (Fig. 2), when the lowest and highest values were registered. The values of BOD5 are higher in the warmer periods of the year (Table 3). PCI analysis shows that a positive correlation exists between BOD5, T and NH4-N (Table 5). The results of PCA (in F4) confirm the results of PCI (Table 6).

During the observed period, measured SS values at all SSt (Fig. 2; Tables 3, 4) exceeded the permitted limit value for class-II water quality [7] in several measurements. The greatest differences are observed between 2011 (7 mg/L) and 2015 (35 mg/L), when the lowest and highest values were registered. The values of SS are higher during winter and spring. The amount of SS is much lower in the downstream part of the Sava River, especially at SSt Ostružnica, while the highest value was recorded at SSt Jamena (Table 4). This situation is caused by the shape of the Sava River Basin. The right side of the basin lies in the limestone Dinaric Mountains, so the right-bank tributaries transport a lot of eroded material, which was registered near SSt Jamena. Additionally, organic pollution and materials from the surface are carried to the riverbed by runoff. According to the results of the PCI analysis, a weak positive correlation was found between SS and NO3-N, which can indicate anthropogenic organic pollution. A negative correlation is observed between SS and both pH and EC, as shown by the results of PCA (in F3) (Table 6).

The greatest differences in NO3-N values are observed between 2013 and 1998. The NO3-N values are higher during the winter than during the rest of the year. All NO3-N samples were well below the limit for class-II water quality [7], qualifying the PCWQ as excellent to good. The measured NO3-N concentrations showed a negative correlation with T and positive correlations with EC, DO, SS and PO4-P (Table 5), which indicates that higher NO3-N concentrations are likely due to leaching from agricultural land.

The values of NO2-N are significantly higher during May, June and December (Table 3). Such results indicate a small influence of seasonality, or the natural factor, and the existence of anthropogenic influence. The NO2-N values registered over the entire period of observation are not within the permitted limit value (<0.03) for class-II water quality [7], especially at SSt Jamena and SSt Ostružnica (Fig. 2).

Differences in PO4-P values are observed between 2006 and 1999 (Fig. 2). Slightly higher values of PO4-P were registered in the colder period of the year. The values of PO4-P are within the limits for class-II water quality and do not exceed the threshold value of 0.1 mg/L [7]. PO4-P is positively correlated with EC and NO3-N (Table 5) which indicates that PO4-P most likely originated from anthropogenic sources [2]. These results were confirmed by PCA analysis (Table 6), where PC2 can be termed as the “eutrophication factor.” Other researchers also provided similar results [18].

Concentrations of NH4-N show the greatest difference between 1997 and 2005 (Fig. 2). Slightly higher values of NH4-N were registered in the summer period of the year. The lowest values were registered at SSt Šabac (0.075 mg/L), while the highest values were registered at SSt Ostružnica (0.102 mg/L) (Table 4). All measurements during the observed period indicate that the measured values do not exceed the permitted limit value of 0.3 mg/L as N for class-II water quality [7]. PCI analysis shows a positive correlation between NH4-N and BOD5 (Table 5). The results of the PCI analysis confirmed the results of PCA in F4 (Table 6).
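The comparison of measured values with class-II limits that runs through this discussion can be expressed as a simple lookup. The sketch below uses only the upper limits quoted in the text (EC, NO2-N, PO4-P and NH4-N) and the NH4-N station means given above; the function name and the single NO2-N sample value are hypothetical and serve only to show what an exceedance looks like.

```python
# Upper limits for class-II water quality as quoted in the text [7]
CLASS_II_LIMITS = {
    "EC": 1000.0,    # µS/cm
    "NO2-N": 0.03,   # unit not stated in the text
    "PO4-P": 0.1,    # mg/L
    "NH4-N": 0.3,    # mg/L as N
}


def exceedances(measurements, limits=CLASS_II_LIMITS):
    """Return the measurements that lie above their class-II upper limit."""
    return {
        name: value
        for name, value in measurements.items()
        if name in limits and value > limits[name]
    }


# NH4-N station means reported in the text (Table 4)
station_means = {
    "Šabac": {"NH4-N": 0.075},
    "Ostružnica": {"NH4-N": 0.102},
}
for station, values in station_means.items():
    print(station, exceedances(values))

# ASSUMPTION: hypothetical single sample, used only to illustrate an exceedance
hypothetical_sample = {"EC": 420.0, "NO2-N": 0.045, "NH4-N": 0.12}
print(exceedances(hypothetical_sample))
```

Parameters with a lower limit, such as DO, or with a permitted range, such as pH, would need a separate comparison and are left out of this sketch.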

SSt Šabac has a noticeably better PCWQ than SSts Jamena and Ostružnica. After entering the Republic of Serbia, the Sava River receives its largest tributary, the Drina River, which is characterized by good PCWQ [13]. To a large extent, the Drina dilutes the water of the Sava River, which results in the better PCWQ at SSt Šabac. The worst PCWQ was recorded at SSt Jamena. Upstream of this SSt lies a larger urban settlement in Bosnia and Herzegovina (Brčko), which has significant industrial sites and whose wastewater heavily influences the Sava River at its entrance to the Republic of Serbia. Downstream of SSt Šabac, at SSt Ostružnica, PCWQ deteriorates again. The main reason is Obrenovac, an industrial center with two thermal power plants (“Nikola Tesla A” and “Nikola Tesla B”). In addition to industry, the whole area has very well developed agriculture, so the inadequate use of chemicals in agriculture is a significant source of water pollution in the Sava River as well. Point sources of pollution most often occur as a result of the discharge of untreated wastewater and leaching from agricultural land. These pressures are not large and do not significantly impair the PCWQ in these sections of the Sava River. However, since this situation may change over time, it is necessary to manage the river adequately, respecting all norms and regulations and regularly monitoring pollution parameters.

CONCLUSIONS

This study was conducted in order to analyze the PCWQ of the Sava River and the types of pollution sources in the part of its course that flows through Serbia. Examining the long-term water quality trend is important for the sustainability of river water and the surrounding ecosystem. Based on the obtained results, it can be concluded that the PCWQ of the Sava River in the Serbian part of the stream is satisfactory. The mean values of the parameters at all SSt were generally within the permitted limit values for class-II water quality. Certain parameters exceeded the permitted values in some years (for example, increased NO2-N values at SSt Jamena during 2004, 2009 and 2012 and at SSt Ostružnica during 2002 and 2003), indicating the existence of certain sources of pollution. PCA identified the factors that affect PCWQ: four factors with eigenvalues greater than 1, which together account for 54.33% of the total variance. As shown in the algorithm results, as long as the covariance curve is included in the four factors, no related issues are detected in the goodness of fit of reflective indicators, and the internal and external quality of the model is excellent. The orthogonal model thus stands. The first factor (F1) showed that the influence of temperature as a natural factor is very significant and that it affects the parameter values and water quality, but it is not the only influence. It has been established that certain pressures exist and are mostly expressed through so-called point sources of pollution (elevated values of BOD5, NO3-N, NO2-N, PO4-P, and NH4-N). This paper can serve as a reference for future research along the Sava River. In the future, other parameters that affect water quality can be included to improve water quality analyses and managerial completeness. Moreover, the paper showed the importance of statistical analysis of large datasets for obtaining better information on surface water quality, which provides a theoretical basis for environmental managers in making the most beneficial decisions.