Introduction

Water quality is directly related to its different uses, which include households, industries, agriculture, fisheries, ecosystems, hydropower, navigation, and recreation (Li et al. 2017a). River water quality can be assessed by analyzing its physical, chemical, and biological parameters, which vary based on location, time, weather, and presence of sources of pollution (Mustapha et al. 2013; Giri and Qiu 2016). Since rivers constitute the main inland water sources in Brazil, it is imperative to prevent and control river pollution and to have reliable information on the quality of water for effective management (Singh et al. 2005).

In fluvial ecosystems, water quality depends on the natural processes, such as precipitation rate, climatic conditions, and soil erosion, as well as the anthropogenic influences that affect its catchment area (Kazi et al. 2009; Álvarez-Cabria et al. 2016; Li et al. 2017b). Anthropogenic actions, such as urbanization processes, industrial, and agricultural activities, may lead to the deterioration of the water quality and the environmental conditions in river basins (Chen and Lu 2014; Calijuri et al. 2015).

It is therefore necessary to understand the relation between land uses and the spatial and seasonal variations in surface water quality in order to assess trends and manage water resources effectively (Ding et al. 2015; Wang et al. 2015). Such understanding can be facilitated by monitoring the water body and analyzing its water quality parameters to identify and control pollutants (Zhao et al. 2012). Reports of comparative studies in river basins impacted by urbanization, industrial, and agricultural activities highlight the importance of monitoring to prevent pollution and protect public health (Cukrov et al. 2012; Blasi et al. 2013).

The use of descriptive statistical analysis to interpret surface water quality has limitations, such as not detecting the long-term correlation between variables and inefficiently delimiting the source of the variation (Mustapha et al. 2013). Hence, recently, multivariate statistical techniques have widely been used as standard methods in the analysis of water quality data for obtaining meaningful information (Singh et al. 2005; Kazi et al. 2009; Li et al. 2018). The main techniques are the multi-way analysis of variance (ANOVA) and principal component analysis (PCA)/factor analysis (FA). These techniques also allow the identification of the possible factors/sources that are responsible for the variations in water quality and influence the water system and the apportionment of the sources (Li et al. 2013). This can be a valuable tool for developing appropriate strategies for effective management of water resources and rapid solution to pollution problems (Singh et al. 2005; Kazi et al. 2009).

Wang et al. (2013) applied PCA techniques in a study of water quality of the Songhua River in the Harbin region of China and concluded that they can be used to assess seasonal and spatial variations in water quality and identify latent sources of pollution. López-López et al. (2011) used the “Three Modes Principal Components Analysis” (3MPCA), a multivariate analysis technique, to estimate river pollution and identify pollution inputs in the Guadalquivir River estuary in Spain. They concluded that this technique is a useful tool to assess long-term tendencies of pollutants in water bodies affected by anthropogenic activities. Multivariate statistical techniques are very helpful for analyzing and interpreting complex data sets, identifying sources of pollutants, and understanding variations in water quality for effective management of water resources (Zhao et al. 2012; Wang et al. 2013).

In the Agreste region of Northeast Brazil, which is the transition region between the semi-arid and the humid coast and characterized by a marked irregularity in the rainfall indices, various effects on the watersheds have, historically, been verified, especially regarding the water quality. The variability in the quality of the water bodies in these basins has been generally associated with changes in land use/land cover and the seasonality of precipitation. To verify this assumption, a monitoring process of water quality was carried out in a watershed representative of Agreste, Northeast of Brazil, generating a large data base with different interrelated parameters that permitted applying the multivariate analysis techniques for their interpretation.

Materials and methods

Characterization of the study area

The Siriri River basin (37° 12′ 52″W, 10° 24′20″S and 36° 54′ 22″W, 10° 45′44″S) is one of the main tributaries of the Japaratuba River, located in the State of Sergipe, in the Northeast region of Brazil (Fig. 1). Its drainage area is 429 km², with an annual average temperature of 25 °C and an average annual rainfall of 1,500 mm near its river mouth, and about 700 mm in its extreme northwest portion (Fig. 2b). The rainy season extends from May to August (Melo Neto et al. 2013).

Fig. 1 Location of water sampling points in the Siriri River basin, in the State of Sergipe, Brazil

Fig. 2 Land use/land cover in the Siriri River basin (a) and mean monthly precipitations recorded from October 2014 to September 2015 (b)

The main types of soils found in this watershed are Latosols, Neosols, Vertisols and Argisols, with the last one being the largest type. Land use is characterized by the predominance of agricultural activities (pastures and agricultural crops), the presence of thin forests, and urban areas (Fig. 2a).

Over the last 10 years, the Siriri River basin has undergone significant changes in its environmental quality, mainly in relation to its water resources, where erosion, riverbed sedimentation, water pollution, and changes in hydrological regimes of its rivers have been observed (Aragão et al. 2013). These impacts may be directly related to the intense changes in land use/land cover in the river basin, with significant removal of forest area and replacement of pasture by agriculture. Added to the problem is the low rate of domestic sewage connections in urban centers.

Water quality assessment in the Siriri River basin

The water quality monitoring in the Siriri River basin was performed monthly. The field measurements and samplings were carried out at five sampling points to comprise areas with distinct land use characteristics (Figs. 1, 2): point P1—predominance of sugarcane cropping; P2—low-density vegetation cover and large areas of pasture; P3—low-density native vegetation, pasture, and wetlands; P4—sugarcane cropland; and P5—urban areas. Normally, the sample collection started at 9 AM at P1 and ended around 2 PM at P5. The collection of water samples started in October 2014 and ended in September 2015, with 12 sampling campaigns.

Samples for physico-chemical and microbiological analysis were collected in three bottles (two polyethylene and one amber type), stored in ice at 4 °C and transported to the laboratory for analyses in less than 24 h. Parameters related to water quality were analyzed in the Laboratory of Water Analysis of the Technological Institute and Research of the State of Sergipe (ITPS), as per the methodologies recommended by the Standard Methods for Examination of Water and Wastewater (Rice et al. 2012). The investigation covered the following parameters:

  • Qualitative: water temperature (Temp), pH, electrical conductivity (EC), dissolved oxygen (DO), turbidity (Turb), total nitrogen (TN), nitrate as N (NO3-N), total phosphorus (TP), total dissolved solids (TDS), chlorophyll-a (Chl-a), and thermotolerant coliforms (TCol).

  • Quantitative: water level (WL), total agricultural area (AGR), total area of pasture (PAS), and total precipitation for 1 day (Pr1d), 10 days (Pr10d), and 30 days (Pr30d) prior to the water sampling date.

Measurements of Temp, pH, EC, DO, NO3-N, Turb and TDS were carried out by means of a multiparameter probe, model Aquaread AP 2000.

To assess the environmental quality of the monitored waters, the maximum allowable concentration limits (MACLs) established for different classes of use by the Brazilian National Environmental Council (CONAMA) Resolution 357/2005 (Brazil 2005) for freshwater bodies were adopted (Table 1). The resolution prescribes standards for 13 classes of water bodies, and these standards were used in the present study. It divides natural water bodies into three groups, namely freshwater, brackish water, and saline water. River waters, which belong to the freshwater group, fall into one of five classes: a special class and classes 1–4. The first four classes are considered appropriate for human consumption but require varying degrees of treatment, from almost none to advanced treatment. Class 4 waters are considered suitable only for navigation and landscape uses. It should be emphasized that the Siriri River does not yet have a water use directive. Thus, each parameter was evaluated against the MACL values for freshwater classes (Class 1, Class 2, and Class 3) established by CONAMA Resolution 357/2005 (Table 1).

Table 1  Maximum allowable concentration limits (MACLs) established by CONAMA 357/2005 Resolution for the classification of freshwater quality

Average rainfall values were determined by applying the Thiessen polygon method for each area contributing to the sampling points, based on data provided by the National Institute of Space Research (INPE) and by the National Water Agency (ANA), considering the data from rain gages installed in the municipalities of Capela (ANA), Nossa Senhora das Dores (INPE), and Santo Amaro das Brotas (INPE). The average precipitation was determined for each sampling point for an antecedent period of one (1), ten (10), and thirty (30) days.
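The rainfall processing described above can be illustrated with a minimal Python sketch. It is not the authors' code: the polygon areas, rainfall depths, and dates are hypothetical, and excluding the sampling day itself from the antecedent window is an assumption, since the text does not state how the window was bounded.

```python
from datetime import date, timedelta


def thiessen_mean(rain_mm, area_km2):
    """Area-weighted mean rainfall: sum(P_i * A_i) / sum(A_i)."""
    total_area = sum(area_km2.values())
    return sum(rain_mm[g] * area_km2[g] for g in area_km2) / total_area


def antecedent_total(daily_mm, sampling_day, n_days):
    """Total rainfall over the n_days before sampling_day.

    Assumption: the sampling day itself is excluded from the window.
    """
    return sum(
        daily_mm.get(sampling_day - timedelta(days=k), 0.0)
        for k in range(1, n_days + 1)
    )


# Hypothetical Thiessen polygon areas (km2) within one contributing area
areas = {
    "Capela": 200.0,
    "Nossa Senhora das Dores": 100.0,
    "Santo Amaro das Brotas": 100.0,
}
# Hypothetical rainfall depths (mm) recorded at each gauge on one day
rain = {
    "Capela": 10.0,
    "Nossa Senhora das Dores": 4.0,
    "Santo Amaro das Brotas": 20.0,
}
basin_mean = thiessen_mean(rain, areas)

# Hypothetical area-averaged daily series (mm), keyed by date
sampling = date(2015, 6, 15)
daily = {
    sampling - timedelta(days=1): 5.0,
    sampling - timedelta(days=7): 12.0,
    sampling - timedelta(days=25): 3.0,
}
pr1d = antecedent_total(daily, sampling, 1)
pr10d = antecedent_total(daily, sampling, 10)
pr30d = antecedent_total(daily, sampling, 30)
```

Each gauge contributes in proportion to the share of the contributing area that its polygon covers, so the gauge with the largest polygon dominates the mean.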

Agricultural class areas (AGR and PAS), existing within the area of contribution to each of the monitoring points, were determined based on the available land use/land cover classification for river basin, obtained from Aragão et al. (2013), supported by field verification, using the QGIS software.

Statistical, exploratory, and multivariate analyses were performed using R software version 3.2.2 (R Core Team 2016). Initially, data were tested for homogeneity of variance and normality of distribution using the Bartlett and Shapiro–Wilk tests, respectively. Data were normalized by logarithmic transformation and then subjected to analysis of variance and to the non-parametric Kruskal–Wallis test. For both methods, the significance level was 0.05. For these analyses, two factors were considered: (1) location of the monitoring point, which directly addressed the issue of land use/land cover, and (2) seasonal variation, considering the samples obtained during the wet (rainy) and dry periods.
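As an illustration of the non-parametric step, the sketch below computes the Kruskal–Wallis H statistic for the seasonality factor (two groups, wet and dry). The study itself used R; this is a Python sketch using only the standard library, and the turbidity values are hypothetical. With two groups the statistic is referred to a chi-square distribution with one degree of freedom, whose upper tail can be written with the complementary error function.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    ties = sum(c ** 3 - c for c in counts.values())
    return h / (1 - ties / (n ** 3 - n))


# Hypothetical turbidity values (NTU) for wet and dry campaigns
wet = [22.1, 30.5, 18.4, 27.9, 25.0, 35.2]
dry = [9.8, 12.3, 15.1, 8.7, 11.0, 14.2]

h = kruskal_wallis_h([wet, dry])
# Two groups -> 1 degree of freedom: P(chi2_1 > h) = erfc(sqrt(h / 2))
p_value = math.erfc(math.sqrt(h / 2))
significant = p_value < 0.05

# The test is rank-based, so a log transformation leaves H unchanged
h_log = kruskal_wallis_h([[math.log10(v) for v in wet],
                          [math.log10(v) for v in dry]])
```

For the location factor (five points) the same H statistic applies, but the p-value would need a chi-square distribution with four degrees of freedom, which is not covered by this sketch.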

The application of principal component analysis (PCA) is aimed at eliminating the overlapping of information among variables thus indicating the potentialities of their contribution to the variability of the phenomenon in a more independent way. Principal component analysis provides information on the most relevant parameters, which describe the data set permitting data reduction with minimum loss of original information (Singh et al. 2005). First, the data were standardized before the correlation matrix for the variables was generated to eliminate problems with different dimensions and units of the variables. Subsequently, the PCA was applied to the correlation matrix of the data and the number of components was selected by the criterion of Jolliffe (2002). Improvements in the interpretation of the meaning of factors were obtained from the orthogonal rotation of the matrix of factorial loads using Varimax method (Pandiarajan et al. 2016).
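The sequence above (standardization, PCA on the correlation matrix, component selection, Varimax rotation) can be sketched in Python with NumPy. This is an illustrative sketch, not the authors' R workflow: the data are synthetic, and the eigenvalue cutoff of 0.7 is an assumed reading of the Jolliffe criterion, since the text does not state the threshold used.

```python
import numpy as np


def pca_correlation(data):
    """PCA on the correlation matrix of column-standardized data."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = (z.T @ z) / (len(z) - 1)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Loadings: eigenvectors scaled by the square root of the eigenvalues
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, loadings


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation of a loading matrix (rows = variables)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        u, s, vt = np.linalg.svd(
            loadings.T
            @ (rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / p)
        )
        rotation = u @ vt
        new_objective = s.sum()
        if new_objective - objective < tol:
            break
        objective = new_objective
    return loadings @ rotation


# Synthetic data: 60 samples (e.g. 5 points x 12 campaigns), 6 variables
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 2))
data = np.column_stack([
    latent[:, 0] + 0.3 * rng.normal(size=60),
    latent[:, 0] + 0.3 * rng.normal(size=60),
    -latent[:, 0] + 0.3 * rng.normal(size=60),
    latent[:, 1] + 0.3 * rng.normal(size=60),
    latent[:, 1] + 0.3 * rng.normal(size=60),
    rng.normal(size=60),
])

eigvals, loadings = pca_correlation(data)
explained = eigvals / eigvals.sum()       # fraction of variance per component
n_keep = int((eigvals > 0.7).sum())       # assumed cutoff of 0.7
rotated = varimax(loadings[:, :n_keep])
```

Because the rotation is orthogonal, it redistributes variance among the retained factors without changing the communality of each variable, which is why it aids interpretation without altering the total variance explained.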

Results and discussion

Exploratory analysis of the data

An exploratory analysis of the data obtained for the water quality assessment at the five locations in the Siriri River basin is presented in Table 2.

Table 2 Statistical descriptors of variables monitored in the Siriri river basin

In terms of variations along the basin, the mean values of water temperature (Temp) had a small variation, with an increase from upstream to downstream. However, this variation could also be due to the sampling times. Real changes in this variable create important changes in the geographic distribution of aquatic species and in the biogeochemistry of river ecosystems (Álvarez-Cabria et al. 2016), which was not observed during this period.

For the pH of the water, the CONAMA norms (Brazil 2005) establish a range from 6.0 to 9.0 for freshwater classes 1, 2, and 3 (Table 1). The observed mean values are well within these limits. Among the five points monitored, point P1 (pH 6.84) had the lowest mean pH (Table 2). This is consistent with the greater presence of nitrogen compounds in this sub-area, which may induce the production of organic acids and reduce the water pH (Wigington et al. 1996).

The mean value of dissolved oxygen (DO) for the Siriri River water (6.29 mg L−1) met the standard established by CONAMA for Class 1 (DO ≥ 6 mg L−1) (Tables 1, 2). However, the mean DO values at points P4 (5.55 mg L−1) and P5 (5.21 mg L−1) were below that limit, although they met the limit established for Class 2 (DO ≥ 5 mg L−1). These low values are probably due to the presence of organic compounds resulting from the use of fertilizers in agricultural areas (P4) or to the discharge of untreated domestic sewage into the river (P5). There is also considerable variation in DO concentrations at each point throughout the year, reflected in the higher coefficients of variation, which may indicate the influence of climatic seasonality, given the irregular distribution of monthly rainfall during the analyzed period (Fig. 2b).
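The comparison of mean DO values with the class limits can be written as a small lookup. The sketch below is illustrative only: it uses just the two DO limits quoted in this paragraph (Class 1 and Class 2), and the function name and the fallback label are hypothetical.

```python
# Minimum DO (mg/L) per class, as quoted in the text; other classes omitted
DO_MIN_BY_CLASS = [("Class 1", 6.0), ("Class 2", 5.0)]


def classify_do(do_mg_l):
    """Return the most restrictive class whose DO minimum is met."""
    for name, minimum in DO_MIN_BY_CLASS:
        if do_mg_l >= minimum:
            return name
    return "below Class 2"  # hypothetical label for values under 5 mg/L


# Mean DO values reported in the text (mg/L)
means = {"basin": 6.29, "P4": 5.55, "P5": 5.21}
classes = {site: classify_do(value) for site, value in means.items()}
```

Listing the classes from most to least restrictive means the first limit that is satisfied gives the best class the water can claim for that parameter.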

For total nitrogen (TN), Table 2 shows that the mean values are below the MACL for Class 1 (3.7 mg L−1 TN for pH ≤ 7.5) (Table 1). The highest mean TN concentrations occurred at points P1 (2.76 mg L−1) and P5 (2.65 mg L−1), where greater variability was also observed throughout the year. This may be related to the variation in monthly rainfall distribution during the study period and to the discharge of untreated domestic effluents into the river.

The mean value of nitrate as N (NO3-N) (1.17 mg L−1) was well below the MACL for Class 1 (up to 10 mg L−1) (Tables 1, 2). As for TN, higher concentrations were found at points P1 (1.49 mg L−1) and P5 (1.28 mg L−1), with the latter also showing very high individual nitrate concentrations. This is most likely due to releases of untreated domestic sewage near point P5 during the sampling period. The mean NO3-N values recorded were similar to those obtained in other studies of urban and rural river basins in Brazil, as reported by Madruga et al. (2008) and Carvalho et al. (2015).

As for total phosphorus (TP), the average value for all five locations was 0.05 mg L−1 (Table 2), meeting the CONAMA standard for Class 2 (TP ≤ 0.05 mg L−1) (Table 1). However, the highest concentration (0.12 mg L−1), observed at point P5, exceeds the MACL of TP for Class 3 (0.075 mg L−1). Thus, water at this point can be classified as Class 4 (TP > 0.075 mg L−1), a class intended only for navigation and landscape harmony. This is almost certainly due to effluent releases from the urban area at this point, with the variation in concentrations among samples resulting from the degree of dilution in the river.

Considering chlorophyll-a (Chl-a), the MACL for Class 1 (up to 10 µg L−1) was exceeded at points P2, P4, and P5 (Tables 1, 2). The mean value for this variable is 13.36 µg L−1, which also exceeds the Class 1 limit. This average was high because of the large effluent loads in the waters at points P2, P4, and P5. At point P5, the value was close to the limit for Class 2 (up to 30 µg L−1), with large variations among the samplings. This location is characterized by the continuous presence of organic matter, with high loads of TP, from domestic sewage. The situation is aggravated by the low flow velocity in this river section, which favors the proliferation of algae and microorganisms (Silva et al. 2015).

The average turbidity (Turb) in the waters of the Siriri River basin was 15.23 NTU, which meets the norm for Class 1 (Turb ≤ 40 NTU), with a notable increase at points P3, P4, and P5 (Table 2). Even so, all values were below the MACL for Class 1 (Table 1). Turbidity increases as the water advances downstream in the river basin, which may be attributed to greater erosion of the river banks and the resulting contribution of sediments to the flow.

The average value of total dissolved solids (TDS) for the waters of Siriri River for all the locations (160.57 mg L−1) was below the MACL for Class 1 waters (500 mg L−1) (Tables 1, 2). The TDS concentrations, as well as the Turb, increase in the downstream direction (Table 2). A similar behavior was noted for electrical conductivity (EC), which has direct correlation with TDS, not showing any excessive salt content.
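The direct relationship between EC and TDS mentioned above can be quantified with Pearson's correlation coefficient. The sketch below is only an illustration of that calculation: the paired values are hypothetical, not the measurements from Table 2.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired readings from P1 (upstream) to P5 (downstream)
ec_us_cm = [120.0, 180.0, 240.0, 310.0, 400.0]   # electrical conductivity
tds_mg_l = [78.0, 117.0, 160.0, 200.0, 262.0]    # total dissolved solids

r = pearson_r(ec_us_cm, tds_mg_l)
```

A coefficient close to 1 would indicate that the two variables carry largely the same information, which is one reason they tend to load on the same principal component.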

The mean values of thermotolerant coliforms (TCol) were very high at all sampling points (Table 2). Since the mean values exceeded the MACL for all water classes set by CONAMA (Table 1), the river water quality fails this norm. The most critical point is P5 (2.9 × 10⁵ MPN 100 mL−1 on average), which has the highest concentration, almost certainly because it is in an urban area where sewage was detected during every field visit. The high coefficients of variation at each point indicate a strong fluctuation in TCol concentrations throughout the year. Carvalho et al. (2015) and Passig et al. (2015) also observed this behavior in other Brazilian river basins with urban influence.
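The coefficient of variation used here to describe fluctuation is the standard deviation expressed as a percentage of the mean. A minimal sketch follows, assuming the sample standard deviation is used (the text does not say which form was applied); the two series are hypothetical and serve only to contrast a stable variable with a strongly fluctuating one.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical monthly series at one point
steady = [10.0, 11.0, 9.0, 10.0, 10.0, 10.0]        # e.g. a stable variable
spiky = [1.0e3, 5.0e5, 2.0e3, 8.0e4, 1.5e3, 3.0e5]  # e.g. coliform counts

cv_steady = coefficient_of_variation(steady)
cv_spiky = coefficient_of_variation(spiky)
```

Because the coefficient is dimensionless, it allows the variability of parameters measured in different units to be compared directly.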

Multi-way analysis of variance

The results obtained by applying multi-way ANOVA to the monitored parameters are shown in Table 3. For each variable in the table, the first line gives the mean values and the second line, in parentheses, gives the standard deviations.

Table 3 Multi-way analysis of the significance of location and seasonality factors with the physical, physico-chemical and biological variables

There are clear indications that the location factor has a significant influence on pH, EC, TDS, DO, Turb, TCol, Chl-a, NO3-N, TN, TP and WL. At point P1, the pH of the water was significantly lower than at the other points; the water level (WL) at point P4 was significantly higher than at the other points of observation. The values of EC and TDS, which are directly related, increased from point P1 to point P5, that is, from headwaters to mouth. These two points showed the largest significant differences, highlighting the effect of the predominant land use close to each point.

DO values in the waters of points P4 and P5 were significantly lower than at the other observation sites (P1, P2 and P3); these were precisely the locations where Turb values were significantly higher than at the others. Much of this behavior can be explained by the increasing presence of domestic sewage in the river water along its course. This can also explain why location is a significant factor for TCol at point P5 but not at other points in the watershed, as P5 is located within the urban area (Table 2). The Turb level at point P1 was significantly lower than at points P3, P4, and P5, but not significantly different from that at point P2.

The river had a higher presence of suspended sediments from point P2 to P5, with almost crystalline water at point P1 coming from the riparian forests still present in this region. TP concentration at point P1 was significantly lower only than that at P5; this behavior may be due to the influence of sewage from the urban area at that point. TN concentration in the river water was significantly influenced by the location factor, with the greatest values at points P1 and P5. The agricultural (P1) and urban (P5) uses effectively contribute to the higher nitrogen loads at these points (Table 3).

Therefore, topographic features play an important role in spatial behavior of water quality characteristics of the river basin. The slope angle, for example, is a major factor in the transfer of inorganic and organic compounds, such as nitrogen (NO3-N and TN) or phosphorus (TP), from the soil to the drainage channels (Villa et al. 2014; Álvarez-Cabria et al. 2016).

In the Siriri River basin, seasonality factor significantly influenced Temp, NO3-N, Chl-a, Turb, TN, TP, Pr1d, Pr10d, and Pr30d values (Table 3). Turb and TP are sensitive to the seasonality factor, with a significant increase in the wet period in the river basin, during which there is a more turbulent regime in the monitored points, with higher transportation and lower deposition of sediment (Bortoletto et al. 2015).

The seasonality factor significantly influenced TN, with concentrations increasing in the wet period. However, recent studies carried out in several river basins differ on whether seasonality influences nutrient concentrations in rivers (Li et al. 2016; Santana et al. 2016; Yongqiu et al. 2016).

Seasonality factor also exerted significant influence on Temp, as expected, and on NO3-N, which was not influenced significantly by the location of the points. This behavior may reflect the transitory nature of the presence of nitrates in the water, constituting a stage of nitrification, which may hinder the adequate quantification in monthly samples.

Both the location and seasonality factors influenced Chl-a; it is worth highlighting point P5, where the values were very high, especially in the wet period (Table 3). Such behavior may result from an excessive nutrient loading by surface washing and sewage (Calijuri et al. 2015), followed by a sequence of sunny dry days, common in the region, contributing to the process of algae proliferation, as reported by Doherty et al. (2014) and Álvarez-Cabria et al. (2016).

The physical parameters related to antecedent precipitation (Pr1d, Pr10d, and Pr30d) differed significantly with the seasonality factor, while WL showed significant differences only among the locations, with small fluctuations in water level. However, it is noteworthy that during precipitation events, surface runoff increases and more sediment is transported to the water bodies (Calijuri et al. 2015). Álvarez-Cabria et al. (2016) suggest that the seasonal differences in water characteristics should be used to increase our understanding of how they affect water quality.

Principal component analysis

The decomposition of the variables into principal components, with the five retained components rotated into factors by the Varimax method, is shown in Table 4. Together, these components accounted for 80% of the total variance.

Table 4 Factors of the matrix components transformed by the Varimax algorithm

According to Liu et al. (2003), factor loadings can be classified into three categories as “strong” (values greater than 0.75), “moderate” (0.75–0.50) and “weak” (0.50–0.30).
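This classification can be expressed as a small helper. The sketch below assumes that the categories apply to the absolute value of a loading (the discussion refers to strong negative loadings), that a loading of exactly 0.75 counts as moderate, and that values below 0.30 receive a hypothetical "negligible" label not used by Liu et al. (2003). The example loadings are illustrative, not taken from Table 4.

```python
def loading_strength(loading):
    """Classify a factor loading by absolute value (after Liu et al. 2003)."""
    magnitude = abs(loading)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.30:
        return "weak"
    return "negligible"  # hypothetical label for loadings below 0.30


# Illustrative loadings for one component
example = {"EC": 0.91, "TDS": 0.89, "DO": -0.80, "Temp": 0.55, "pH": 0.12}
strengths = {name: loading_strength(value) for name, value in example.items()}
```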

PC1, which explains 26% of total variance, has strong positive loadings on EC, TDS, and strong negative loading on DO. This component concentrates most of the effects of agricultural activities, which contribute to increasing EC and TDS concentrations, and reducing DO concentration. Although EC indicates natural pollution, soil erosion or weathering effects on water quality due to seasonal changes (Ogwueleka 2015), it may also be influenced by agricultural activities and have its value increased in some river waters. Thus, most of the spatial variability of water quality in the watershed may be associated with the condition of land use/land cover, which is supported by other studies (Deng et al. 2015; Calijuri et al. 2015; Giri and Qiu 2016). PC1 also has moderate positive loading on Temp, indicating some influence of seasonality in these parameter concentrations.

PC2 accounted for 20% of the total variance. Strong positive loadings were observed on nitrate and total nitrogen, and a moderate negative loading on Temp. This component can be associated with the nutrient portion of water pollution, mainly from domestic sewage and fertilizers applied to crops in the basin. The loadings of both parameters (NO3-N and TN) are similar, indicating contributions from different sources at different time intervals and characterizing various stages of the nitrogen cycle in river water (Vadde et al. 2018).

Strong positive loadings were observed on turbidity and total phosphorus in the third principal component (PC3). This component explains 13% of the total variance and concentrates the influence of sediment on water quality, suggesting that TP is associated with soil particles transported by overland flow as soil is removed from the basin. According to Schilling et al. (2017), turbidity is a promising surrogate for TP concentrations, mainly in agricultural watersheds such as the Siriri River basin.

PC4 accounted for 11% of the total variance; however, only water level (WL) had a loading above 0.75. This behavior may be a sign that the basin responds too quickly to rainfall events for monthly sampling to capture, and that a different sampling frequency would be necessary to assess the real influence of WL on water quality parameters.

The fifth component (PC5) explained 11% of the total variance and has strong positive loading for Chl-a and TCol. This component can be associated with microbial origin and can be related to sewage pollution in the river. Runoff from agricultural lands, farm animal feedlots and anthropogenic activities, such as direct discharge of fecal materials into rivers, also contaminate river water with pathogenic microorganisms (Dey et al. 2017). Fecal coliform bacteria concentrations tend to vary with season because of seasonal variations in climatic factors. This finding is corroborated by the results reported by Calijuri et al. (2015), which showed that the water quality of a watershed is directly influenced by the anthropogenic process, with the worst environmental conditions being found in localities close to urban areas and agricultural land.

Conclusions

Organic pollutants are predominant in the waters of the Siriri River basin, mainly TN, NO3-N, and TP, which are normally related to agricultural activities and the presence of untreated urban sewage and which contribute to the degradation of water quality. Concentrations increased in the downstream direction, an effect that is intensified when the river flow is reduced, especially in the dry season.

The influence of spatial variability on pH, EC, TDS, DO, TCol, Chl-a, Turb, TN, TP, and WL was also detected, as was seasonal variability in Temp, NO3-N, Chl-a, Turb, TN, and TP. These results may indicate that anthropogenic activities carried out at different watershed locations and in different seasons contribute to water quality degradation, related to both agricultural and urban land use.

Another important result came from the principal component analysis (PCA), which concentrated the variables related to the influence of agricultural land use in component 1 (PC1), accounting for 26% of the total variation of water quality in the watershed. PC2 accounted for 20% of the total variance and was related to the nutrient portion of water pollution. PC3 explained 13% of the total variance and concentrated the influence of sediment on water quality. PC4 accounted for 11% of the total variance and was related to physical parameters. The fifth component (PC5) explained 11% of the total variance and was associated with microbial pollution in the river.