Introduction

Water quality monitoring networks are essential to understanding and managing freshwater systems in the face of global anthropogenic impacts. While researchers and managers have monitored local water quality in some areas since the late 1800s (Worrall et al. 2015), the first formally designed water quality networks were established only in the late 1960s (Harmancioglu et al. 1998; Strobl and Robillard 2008). These networks should allow the detection of both spatial and temporal trends (Tavakol et al. 2017; Calazans et al. 2018; Peña-Guzmán et al. 2019), and possibly the determination of reference conditions to control and regulate human activities (Strobl and Robillard 2008; Cunha et al. 2011). Such networks can provide important information for establishing and evaluating policies to improve water quality, as well as linking costs and benefits of ecosystem services to specific management actions (Nel et al. 2009; Pynegar et al. 2018). Their general objectives are 1) accurately monitoring the water quality of a specified region, 2) identifying problematic areas, and 3) accomplishing these first two goals as efficiently as possible.

Surface freshwaters (e.g., rivers and streams) provide resources for human activities, with specific water quality requirements associated with different uses (e.g., irrigation, navigation, energy generation, and water supply). Natural solute concentrations in aquatic systems are influenced by transport, cycling, and retention driven by physical, chemical, and biological processes as solutes move through the watershed (e.g., atmospheric deposition, surface runoff, weathering of rocks, biological uptake, sorption, and desorption) (Meybeck and Helmer 1989; Stream Solute Workshop 1990). Intensification of anthropogenic activities can cause rapid land use shifts and create significant impacts on ecological processes (Meybeck 2003). Watersheds integrate large terrestrial areas, perturbations are transmitted downstream, and many rivers are dammed. Thus, water quality characterization is complex in drainage networks (Bostanmaneshrad et al. 2018; Rodrigues et al. 2018). Consequently, representative and well-structured monitoring programs are necessary to support water resources management.

Monitoring strategies in many countries were historically dependent on logistical aspects and based on subjective professional judgments for defining the temporal and spatial components of the water quality monitoring network (WQMN). This approach can neglect specific hydrologic aspects of the monitoring area and lack clear definitions of the monitoring goals (Harmancioglu et al. 1998; Strobl and Robillard 2008; Mei et al. 2011; Mavukkandy et al. 2014). These deficiencies, coupled with limited feedback and evaluation routines, frequently lead to inefficiency and production of monitoring data with poor cost-benefit relationships. Adaptive management approaches can address such inefficiencies and respond to changing conditions. Regular reassessment of a WQMN allows identification of adjustments required because of data mining problems (e.g., insufficient, missing, or unreliable data), environmental changes, availability of new monitoring technologies, and potentially changing monitoring goals (Harmancioglu et al. 1998; Strobl and Robillard 2008).

Researchers have previously assessed existing WQMNs (e.g., Mei et al. 2011; Do et al. 2013; Calazans et al. 2018; Peña-Guzmán et al. 2019), including the revision of monitoring site locations, sampling frequency, and water quality parameters measured (Jiang et al. 2020). Numerous techniques have been applied for these purposes, including multivariate statistical analysis (Ouyang 2005; Kovács et al. 2015; Calazans et al. 2018; Peña-Guzmán et al. 2019), entropy analysis (Mahjouri and Kerachian 2011), genetic algorithms (Park et al. 2006), geospatial analysis (Strobl et al. 2006), water quality modeling (Chen et al. 2012), analytic hierarchical process (Do et al. 2013), and fuzzy logic (Chang and Lin 2014). Although different methods are available to optimize WQMNs, previous studies did not integrate (or only partially integrated) some relevant aspects of WQMN planning into the optimization methods, such as the monitoring goals and the environmental representativeness of the monitoring sites. Providing a flexible approach to fill this gap motivated the present study.

An efficient river WQMN is especially crucial in developing countries, where financial resources are usually scarce, but population growth and water quality degradation are rapid (Capps et al. 2016; Ma et al. 2020). However, technical guidelines for planning WQMN in such countries are still limited, potentially leading to high monitoring costs and insufficient data to guide water resources management programs (Harmancioglu et al. 1998; Mavukkandy et al. 2014; Camara et al. 2020). All these challenges are present in Brazil, where limited sanitation infrastructure (e.g., 27% of the population with no sewage collection and treatment, ANA 2017) represents the main cause of water pollution. According to the Brazilian Water Regulatory Agency (ANA, “Agência Nacional de Águas e Saneamento Básico”), 12% of all the 8,863 surface water samples collected in the country in 2015 had poor water quality (ANA 2017). Despite the water quality degradation, 8 out of the 27 Brazilian states had no active WQMNs until 2016 (ANA 2018). Sociological and biological issues can lead to conditions that vary from those in more developed areas such as North America and Europe. For example, riparian zones in small streams can dominate water quality in some watersheds (e.g., Dodds and Oakes 2008).

In addition to sewage contributions, other point and nonpoint pollution sources affect the water quality of Brazilian rivers, especially nutrient and pesticide inputs from agricultural areas (Maillard and Pinheiro Santos 2008; de Mello et al. 2018). However, urban development in tropical cities without centralized sewage treatment can lead to greater influences of riparian areas further down in the watershed (Tromboni and Dodds 2017). Additionally, sediments from deforested areas (Taniwaki et al. 2017), metal contamination from mining (Veado et al. 2006), and industrial wastewater discharges from fertilizer, leather, metallurgy, and sugar/ethanol production (Schulz and Martins-Junior 2001; Gunkel et al. 2007; Martinelli et al. 2013; Silva et al. 2016) can jeopardize water quality. These differing sources of impact on water quality demand different approaches to WQMNs.

São Paulo State (Southeastern Brazil) is the most populous in the country (IBGE 2019). The state has a large diversity of potential influences on water quality, some particular to tropical or subtropical areas, and others to rapidly developing countries. For example, it includes a megalopolis (the city of São Paulo and surrounding municipalities, with over 21 million inhabitants) and numerous other cities ranging from small towns to cities of over a million people. The state has extensive agriculture, industry, forest harvest, reservoirs, and other land developments. Yet, there are still reference areas, and several protected areas exist. Typical of many developing countries, untreated sewage discharges are still the main cause of local surface water pollution (ANA 2005; CETESB 2019). São Paulo State is attempting to address these water quality challenges. In 2016, the São Paulo surface WQMN run by the State Environmental Company (CETESB, “Companhia Ambiental do Estado de São Paulo”) had almost 500 monitoring sites across the state, with a bimonthly or quarterly frequency, and quantification of up to 60 water quality parameters at each site (CETESB 2019). This represents a site density of around 2.0 sites/1,000 km² (CETESB 2017), which is six times above the national average (ANA 2018), but still lower than in the European Union (7.0 sites/1,000 km²) (European Commission 2010). Traditionally in Brazil, the WQMN and the streamflow networks run separately and are not integrated. Such separation is also frequently observed in São Paulo State, leading to water quality data that are not paired with river/stream information on discharge, water velocity, and level. Thus, in this paper we concentrate on the water quality network with the realization that future work should harmonize hydrologic and water quality monitoring station locations.
The São Paulo WQMN has a strong concentration of sampling sites in the eastern portion of the state due to higher anthropogenic pressures, more severe degradation of water resources, and logistical aspects (e.g., the proximity of analytical labs to process the samples) (Midaglia 2011). Thus, the development of a proposal to promote spatial optimization of São Paulo’s WQMN is essential and could be a starting point for similar initiatives in other areas with common challenges.

Here we assess São Paulo’s WQMN with respect to its spatial representativeness, with a more detailed focus on the regional scale and based on specific statistical procedures and criteria. We used both multivariate statistics and a stratified sampling strategy, and defined clear, objective criteria to assess redundancies in the current network and to identify under-represented areas. We hypothesized that the evaluation of the studied WQMN would show spatial imbalances with biases toward specific areas. We expect that our workflow and methodology could be useful in subtropical regions and particularly in other developing countries with financial constraints that aim to improve their WQMNs.

Materials and methods

Study area

São Paulo is the most industrialized state in Brazil (IBGE 2020) (Fig. 1), and covers an area of approximately 248,200 km². The population of about 45 million inhabitants is strongly concentrated in the eastern portion of its territory (CETESB 2019). According to the Köppen-Geiger classification (Kottek et al. 2006), the dominant climate is tropical wet with dry winters (Aw), with relatively high average annual air temperatures (15-25°C), and average annual rainfall between 1,250 and 2,250 mm (De Souza Rolim et al. 2007). The main land use is agricultural (40%), while approximately 3% of the state is covered by artificial (built-up) areas and 11% by forested areas, especially tropical ombrophilous forest (Kronka et al. 2005).

Fig. 1 Map of Brazil with the São Paulo State in gray (left) and map of the São Paulo State with the studied UGRHIs colored (right)

São Paulo State is divided into 22 water resource management units (UGRHIs, “Unidades de Gerenciamento de Recursos Hídricos”) for administrative purposes, based on its watersheds and similarities of environmental features (geomorphology, geology, regional hydrology, and hydrogeology). The present study focused on seven UGRHIs (Fig. 1) characterized by different predominant land uses, such as agriculture (UGRHIs 09, 14, and 15), artificial (i.e., modified and urban) areas (UGRHI 06), and grassland or forest vegetation (UGRHIs 01, 03, and 11). The selected UGRHIs represent 32% of the area and more than 55% of the population of the state, ~25 million inhabitants (Table 1).

Table 1 Specific information about the studied UGRHIs and the São Paulo State in general, including total area (km²), population (10³ inhabitants), population density (inhabitants/km²), and predominant land uses in terms of area (adapted from CETESB 2019)

Database

Data on Escherichia coli (E. coli), pH, biochemical oxygen demand (BOD), total nitrogen (TN), total phosphorus (TP), turbidity, total solids (TS), temperature, and dissolved oxygen (DO) from 2004 to 2018 were compiled for 160 river or stream monitoring sites. These sites were monitored by CETESB in São Paulo State (InfoÁguas Online System) in UGRHIs 01, 03, 06, 09, 11, 14, and 15 (Table 1), with a bimonthly sampling frequency. The laboratory analyses were performed by CETESB (with laboratory accreditation by the National Institute of Metrology, Standardization and Industrial Quality) following Standard Methods (APHA 1998; APHA 2005). The parameters were selected because they are used for the calculation of the Water Quality Index (WQI) in Brazil, which is an adaptation of the index developed by the National Sanitation Foundation of the United States (Finotti et al. 2015). This type of index is broadly applied to classify surface water quality (Noori et al. 2019). It was developed to provide an overview of the surface water quality, to compare stream conditions on different spatial and temporal scales, and to measure the progress of water quality programs (Brown et al. 1970). The WQI values range from 0 to 100, with five water quality categories: very poor (WQI ≤ 19), poor (19 < WQI ≤ 36), regular (36 < WQI ≤ 51), good (51 < WQI ≤ 79), or excellent (WQI > 79). The WQI is calculated by multiplying weighted scores for these nine parameters (CETESB 2019). The weights for each parameter were defined based on the feedback of a group of water quality experts, with values ranging from 0.08 (turbidity and total solids) to 0.15 (E. coli). The scores are obtained from variation curves, which were also defined by a group of water quality experts. Note that our study accounts only for the nine parameters used for the WQI calculation.
Other measurements included in CETESB’s monitoring program (e.g., metals, chlorophyll, pesticides, and toxicity) were not evaluated in this paper. However, the WQI variables used here are common indicators of water quality applied worldwide.
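
As a rough illustration of the index described above, the sketch below computes a WQI as a weighted product of parameter scores and maps the result to the five categories listed in the text. It assumes the multiplicative form in which each score (0 to 100) is raised to its weight and the weights sum to 1. The function names, the equal placeholder weights, and the example scores are hypothetical; the actual weights and rating curves are those defined by CETESB (2019).

```python
import math

# Upper bounds of the categories given in the text; values above 79 are "excellent".
WQI_CATEGORIES = [(19, "very poor"), (36, "poor"), (51, "regular"), (79, "good")]


def wqi(scores, weights):
    """Weighted product of parameter scores (each 0-100); weights must sum to 1."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same parameters")
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1")
    return math.prod(scores[p] ** weights[p] for p in scores)


def classify_wqi(value):
    """Map a WQI value to the category thresholds quoted in the text."""
    for upper, label in WQI_CATEGORIES:
        if value <= upper:
            return label
    return "excellent"


# Placeholder inputs: equal weights and identical scores, for illustration only.
parameters = ["E. coli", "pH", "BOD", "TN", "TP", "turbidity", "TS", "temperature", "DO"]
example_weights = {p: 1 / len(parameters) for p in parameters}
example_scores = dict.fromkeys(parameters, 60.0)
print(classify_wqi(wqi(example_scores, example_weights)))
```

Because the index is a product, a single very low score pulls the WQI down more strongly than it would in a weighted arithmetic mean.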

The raw database was initially submitted to a data treatment procedure for the selection of appropriate monitoring sites and parameters to perform further multivariate statistics. We detail the workflow of this procedure in Online Resource 1. All WQI parameters and 143 monitoring sites across 100 rivers and streams were considered suitable for the subsequent analyses (Table 2), and are hereafter referred to as the “approved database”. The only exception was in UGRHI 11, where BOD and TN data were not consistent due to many censored data points. In UGRHI 11, BOD was not used for the cluster analysis, and TN was replaced by the sum of total Kjeldahl nitrogen and nitrate.

Table 2 Summary of the data treatment, with the initial number of monitoring sites, initial site density, number of sites with suitable data for cluster analysis, and number of rivers and streams represented by sites with appropriate data for cluster analysis

Using primarily public (i.e., freely available) sources, cartographic data were gathered to support the stratified sampling strategy. The cartographic variables (see details in Table S1 of Online Resource 2) were land use, average annual rainfall isohyets, soil types, hydrography, watersheds coded by the Otto Pfafstetter method, protected areas, and a digital elevation model (DEM). Contrasting temporal and spatial scales are limitations of our cartographic database, and these constraints in data availability could be addressed in the future to further refine our analysis.

The multivariate statistics were performed in OriginPro 2016® and MATLAB 2015a, while geospatial analyses were performed in ArcGIS 10.3®.

Redundancy identification

For each UGRHI, cluster analysis identified the degree of redundancy of the monitoring sites in relation to the studied WQI parameters, as recommended by Mavukkandy et al. (2014), CCME (2015), Calazans et al. (2018) and Peña-Guzmán et al. (2019). The agglomerative hierarchical clustering method was selected due to the advantages described in previous studies (Namratha and Prajwala 2012; Gomes et al. 2014). The water quality parameters were first log-transformed. The median of each parameter at each site was then obtained and standardized to Z-scores, which were the input data for the cluster analysis.
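
The preprocessing described above can be sketched as follows. This is a minimal illustration, not the authors' OriginPro/MATLAB workflow: the base-10 logarithm, the sample standard deviation (`ddof=1`), and the function and variable names are assumptions, since the text does not specify them.

```python
import numpy as np


def site_features(samples):
    """Return (site names, Z-scored per-site medians of log-transformed data).

    samples: dict mapping site name -> 2-D array (n_samples x n_parameters)
    of strictly positive values, with NaN marking missing measurements.
    """
    sites = sorted(samples)
    # Log-transform first, then take the per-parameter median at each site.
    medians = np.array(
        [np.nanmedian(np.log10(np.asarray(samples[s], dtype=float)), axis=0) for s in sites]
    )
    # Standardize each parameter (column) across sites.
    # A parameter that is constant across sites would give a zero divisor here.
    z = (medians - medians.mean(axis=0)) / medians.std(axis=0, ddof=1)
    return sites, z


# Hypothetical data: three sites, three samples each, two parameters.
demo = {
    "A": [[1, 10], [10, 100], [100, 1000]],
    "B": [[10, 10], [100, 100], [1000, 1000]],
    "C": [[100, 100], [1000, 1000], [10000, 10000]],
}
demo_sites, demo_z = site_features(demo)
print(demo_sites)
print(demo_z.round(2))
```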

Linkage methods and similarity measures were chosen based on the cophenetic correlation coefficient (Sokal and Rohlf 1962), which has been employed to evaluate the performance of different clustering techniques (da Silva and dos Dias 2013; Saraçli et al. 2013). The cophenetic coefficients estimate the degree of distortion of the original distances between objects after the outputs of the cluster analysis, with a range from 0 to 1, and best results represented by higher coefficients (Saraçli et al. 2013). Single, complete, average, median, Ward, and centroid linkage methods were tested associated with Euclidean and squared Euclidean distances.
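
A comparison of this kind can be sketched with SciPy, assuming the standardized site-by-parameter matrix is available as a NumPy array. The function name and the random demo data are hypothetical, and SciPy is used here for illustration instead of the software named in the text. Note that SciPy treats centroid, median, and Ward linkage as well defined only for Euclidean distances, so the squared-Euclidean results for those methods should be read with caution.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import pdist

METHODS = ("single", "complete", "average", "median", "ward", "centroid")
METRICS = ("euclidean", "sqeuclidean")


def cophenetic_scores(z):
    """Cophenetic correlation coefficient for every (method, metric) pair."""
    scores = {}
    for metric in METRICS:
        distances = pdist(z, metric=metric)  # condensed pairwise distances
        for method in METHODS:
            tree = linkage(distances, method=method)
            coefficient, _ = cophenet(tree, distances)
            scores[(method, metric)] = float(coefficient)
    return scores


# Hypothetical input: 12 sites x 9 standardized parameters.
rng = np.random.default_rng(0)
demo_matrix = rng.normal(size=(12, 9))
demo_scores = cophenetic_scores(demo_matrix)
best_pair = max(demo_scores, key=demo_scores.get)
print(best_pair, round(demo_scores[best_pair], 3))
```

The pair with the highest coefficient distorts the original distances the least, which is the selection rule described in the paragraph above.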

The clustering criteria were based on the Silhouette Index, which can estimate the goodness-of-fit and validate the number of groups formed (Rousseeuw 1987). This index promotes comparisons between the dissimilarity of one object to its own group and the dissimilarity to other groups. A given object is well classified if the internal dissimilarity is low, but the external one is high. An average Silhouette Index greater than or equal to 0.71, as well as absence of negative values, were the criteria considered to delineate the most appropriate number of groups (Kaufman and Rousseeuw 2009).
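
The acceptance rule above can be illustrated with a small, self-contained implementation of the silhouette value, s = (b - a) / max(a, b), where a is the mean distance from a site to the other members of its group and b is the smallest mean distance to any other group. The function names are hypothetical. The value assigned to single-member groups differs between implementations, so `singleton_value` is exposed as an explicit assumption.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def silhouette_values(z, labels, singleton_value=0.0):
    """Silhouette value of each object (Euclidean distance, at least 2 groups)."""
    labels = np.asarray(labels)
    dist = squareform(pdist(z))
    groups = set(labels.tolist())
    values = np.empty(len(labels))
    for i, own in enumerate(labels):
        same = labels == own
        same[i] = False  # exclude the object itself
        if not same.any():
            values[i] = singleton_value  # convention-dependent (assumption)
            continue
        a = dist[i, same].mean()  # internal dissimilarity
        b = min(dist[i, labels == g].mean() for g in groups if g != own)
        values[i] = (b - a) / max(a, b)
    return values


def acceptable(values, threshold=0.71):
    """Criteria from the text: average >= 0.71 and no negative values."""
    return bool(values.mean() >= threshold and (values >= 0).all())


# Hypothetical example: two compact, well-separated pairs of sites.
points = np.array([[0.0, 0.0], [0.0, 0.1], [10.0, 10.0], [10.0, 10.1]])
good = silhouette_values(points, [1, 1, 2, 2])
bad = silhouette_values(points, [1, 2, 1, 2])
print(acceptable(good), acceptable(bad))
```

In practice, the dendrogram would be cut at successive numbers of groups and each partition checked with `acceptable`; how to choose among several acceptable partitions is not specified in the text.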

The water quality monitoring sites allocated in the same group were considered to produce redundant information in relation to the WQI parameters. However, while the cluster analysis indicates the monitoring sites eligible for exclusion following a statistical approach, the method is not able to evaluate if these sites encompass different monitoring goals and therefore should not be excluded. Thus, we established the monitoring goals for each site to avoid network optimization based only on the statistical approach.

Definition of the monitoring goals

Four possible monitoring goals were considered: trend analysis, regulation, establishment of reference (pristine) conditions, and water body representativeness, with the respective criteria presented in Table 3. These goals are typically considered in surface water quality monitoring (Bartram and Ballance 1996; CCME 2015; Arle et al. 2016) and are bracketed by the general goals presented by CETESB (2019) for the São Paulo State WQMN.

Table 3 Water quality monitoring goals and the respective classification criteria

Following the group formation in the cluster analysis and the assignment of the monitoring goals for each site, the following criteria were used to select monitoring sites that should not be excluded even with the statistical redundancy indicated by the cluster analysis:

1) In each group, keep the monitoring site that satisfies the most goals;

2) Keep all monitoring sites associated with the regulation goal;

3) In each group, keep the monitoring site with the largest drainage area that meets the establishment of reference goal. For sites that do not meet the data requirements for the cluster analysis, keep all that satisfy the establishment of reference goal;

4) In each group, keep the monitoring sites with the largest drainage areas if another site located on the same river was not selected by the previous criteria. The drainage area data were obtained from the Hidroweb system from ANA. For sites lacking this information, the drainage area was calculated with the Hydrology tool from ArcGIS 10.3®.
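
The four criteria above can be read as a rule-based filter. The sketch below is one possible interpretation, not the authors' implementation: the `Site` record, the goal labels, the tie-breaking in criterion 1 (first site listed wins), and the reading of criterion 4 (per group and river, add the largest-drainage site when no site on that river was kept by criteria 1 to 3) are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Site:
    code: str
    group: Optional[int]  # cluster label; None if excluded from the cluster analysis
    river: str
    drainage_km2: float
    goals: frozenset  # e.g. {"trend", "regulation", "reference"}


def sites_to_keep(sites):
    groups = defaultdict(list)
    for site in sites:
        if site.group is not None:
            groups[site.group].append(site)

    keep = set()
    for members in groups.values():
        # Criterion 1: the site that satisfies the most goals.
        keep.add(max(members, key=lambda s: len(s.goals)).code)
        # Criterion 3: the largest-drainage site meeting the reference goal.
        reference = [s for s in members if "reference" in s.goals]
        if reference:
            keep.add(max(reference, key=lambda s: s.drainage_km2).code)
    # Criterion 2: every site associated with the regulation goal.
    keep.update(s.code for s in sites if "regulation" in s.goals)
    # Criterion 3 (second part): reference sites outside the cluster analysis.
    keep.update(s.code for s in sites if s.group is None and "reference" in s.goals)

    # Criterion 4: rivers with no site kept so far get their largest-drainage site.
    rivers_kept = {s.river for s in sites if s.code in keep}
    for members in groups.values():
        by_river = defaultdict(list)
        for s in members:
            by_river[s.river].append(s)
        for river, candidates in by_river.items():
            if river not in rivers_kept:
                keep.add(max(candidates, key=lambda s: s.drainage_km2).code)
    return keep


# Hypothetical sites for illustration.
demo_sites_list = [
    Site("A", 1, "R1", 100, frozenset({"trend", "regulation"})),
    Site("B", 1, "R1", 200, frozenset({"trend"})),
    Site("C", 1, "R2", 50, frozenset({"trend"})),
    Site("D", 1, "R2", 80, frozenset({"trend"})),
    Site("E", 2, "R3", 10, frozenset({"reference"})),
    Site("F", 2, "R3", 30, frozenset({"reference"})),
    Site("G", None, "R4", 5, frozenset({"reference"})),
    Site("H", None, "R4", 5, frozenset({"trend"})),
]
print(sorted(sites_to_keep(demo_sites_list)))
```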

Monitoring network representativeness evaluation and spatial update proposal

We performed the representativeness evaluation with a stratified sampling strategy, which is based on the division of each UGRHI’s area into subgroups called strata. These spatial strata can be identified by the combination of relevant features for the study area, demonstrating high internal homogeneity (Gilbert 1987; Dobbie et al. 2008). The main advantages of this technique are the reduction of data redundancy and unsampled areas (Haining 2015); good performance to represent average and variance of data from heterogeneous areas (Cochran 1977; Wang et al. 2010); and unbiased sampling (Catherine et al. 2008; Dobbie et al. 2008).

The first step to create the strata was the management unit definition. Level 6 basins coded by the Otto Pfafstetter method were considered because this is the standard unit in the Brazilian Water Resources Policy (CNRH 2002) and is recommended by the European Union (de Jager and Vogt 2010). For the strata definition, we performed cluster analysis, as suggested by Danz et al. (2005) and Catherine et al. (2008) for grouping basins based on relevant environmental features. The clustering methodology was the same as described above, and the input data were the percentages for each land use category (“anthropogenic factor”), average annual rainfall isohyets (“hydrological factor”), and soil type (“geological factor”), log-transformed to reduce kurtosis and asymmetry. The use of average annual rainfall isohyets to define strata was possible after spatial interpolation with the Topo to Raster tool from ArcGIS 10.3®, followed by reclassification into three categories: below average rainfall, average rainfall, and above average rainfall. Rainfall values between 90% and 110% of the average annual rainfall were classified as “average rainfall”.
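
The three-class rainfall reclassification can be written compactly. In this sketch the function name is hypothetical, the band limits are treated as inclusive, and the reference average defaults to the mean of the values supplied; the text does not state over which area the average was computed, so that default is an assumption.

```python
import numpy as np


def classify_rainfall(values_mm, reference_mean=None):
    """Label each rainfall value relative to 90%-110% of the average."""
    values = np.asarray(values_mm, dtype=float)
    mean = values.mean() if reference_mean is None else float(reference_mean)
    ratio = values / mean
    return np.where(
        ratio < 0.9,
        "below average rainfall",
        np.where(ratio > 1.1, "above average rainfall", "average rainfall"),
    )


# Hypothetical interpolated values (mm/year) compared with a 1,500 mm average.
rain_classes = classify_rainfall([1000, 1500, 2000], reference_mean=1500)
print(list(rain_classes))
```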

Each group of Pfafstetter level 6 basins formed in the cluster analysis (a stratum) represents different environmental features for which monitoring sites can be allocated. However, as advised by Danz et al. (2005) and Dobbie et al. (2008), a large number of groups formed by small units could make the water quality monitoring program impracticable and produce a high level of redundancy. Thus, we selected as representative for the trend analysis goal (see Table 3) the strata with an area of at least 10% of that of the largest stratum.
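
The area threshold can be expressed as a simple filter. The function name and the example areas are hypothetical, and treating the 10% limit as inclusive is an assumption.

```python
def representative_strata(areas_km2, min_fraction=0.10):
    """Keep strata whose area is at least min_fraction of the largest stratum."""
    threshold = min_fraction * max(areas_km2.values())
    return {name: area for name, area in areas_km2.items() if area >= threshold}


# Hypothetical stratum areas (km2) within one UGRHI.
demo_areas = {"S1": 1000.0, "S2": 150.0, "S3": 60.0}
print(sorted(representative_strata(demo_areas)))
```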

Whenever the existing monitoring sites were insufficient to represent all strata for trend analysis, priority areas for network expansion were investigated. The proposal consisted of indicating river reaches with Strahler order (Strahler 1952) greater than or equal to three located in the unrepresented strata. We considered rivers with order greater than or equal to three because they integrate larger areas of land, and the buffering mechanisms of downstream portions of a river network reduce the influence of unusual local conditions on water quality (Wohl 2017). However, it was not the objective of our study to define the micro-location of sites because this is mostly constrained by field inspection, taking into account access and sampling safety, mixing conditions, point sources of pollution, presence of discharge monitoring sites, etc. (Sanders 1988).
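
For readers unfamiliar with Strahler ordering, the sketch below computes it for a small tree-shaped network and selects reaches of order three or higher. It is a generic illustration, not the GIS procedure used in the study: the network encoding (each reach mapped to the reaches flowing directly into it) and all names are assumptions, and the additional requirement that a reach lie in an unrepresented stratum is not modeled.

```python
def strahler_orders(upstream):
    """Strahler order of every reach in a tree-shaped drainage network.

    upstream: dict mapping a reach id to the ids of the reaches that flow
    directly into it; reaches that never appear as keys are headwaters.
    """
    orders = {}

    def order(reach):
        if reach not in orders:
            tributaries = [order(t) for t in upstream.get(reach, [])]
            if not tributaries:
                orders[reach] = 1  # headwater reach
            else:
                top = max(tributaries)
                # Order increases only where two or more reaches of the top order meet.
                orders[reach] = top + 1 if tributaries.count(top) >= 2 else top
        return orders[reach]

    for reach in upstream:
        order(reach)
    return orders


def candidate_reaches(orders, minimum_order=3):
    return sorted(r for r, o in orders.items() if o >= minimum_order)


# Hypothetical network: h1..h5 are headwaters, d is the outlet reach.
network = {"a": ["h1", "h2"], "b": ["h3", "h4"], "c": ["a", "b"], "d": ["c", "h5"]}
reach_orders = strahler_orders(network)
print(candidate_reaches(reach_orders))
```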

The present study also identified representative strata for sites aiming at the establishment of reference conditions (Table 3). We indicated areas with low human disturbance, representative of the trend analysis strata, and with small drainage areas, as recommended by Helmer (1994) and WMO (2013). We generated such strata by overlapping the layer of trend analysis strata with more than 50% of the area covered by forest vegetation or grassland with the layer of protected areas in São Paulo State. Protected areas are instituted by law to protect the natural resources within their limits (Brasil 2000). In São Paulo State, they are frequently designed to promote the maintenance or improvement of river and stream water quality with a focus on water supply (Mello-Théry 2011; Dib et al. 2020), playing an important role as buffer zones to reduce the impacts of anthropogenic activities on water quality (e.g., nutrient and organic pollutant loads, sediment inputs) (Kuhlmann et al. 2014; Cunha et al. 2016).

Similarly to the protocol for the trend analysis goal, when the existing monitoring sites were insufficient to represent all strata for the establishment of reference conditions, a proposal for priority areas for network expansion was presented. This consisted of river reaches with Strahler order (Strahler 1952) less than or equal to two located in the unrepresented strata. We prioritized lower Strahler order reaches in this case because the increase of the drainage areas reduces the probability of identifying reaches with low human disturbance (Dodds and Oakes 2004), making it more complex to detect the effects of land use changes on surface water quality (Thomas et al. 2004). We also evaluated whether the river reaches suggested for network expansion for the trend analysis goal intersected the unrepresented strata for the establishment of reference. If so, these reaches were considered sufficient to meet both goals.

The final WQMN spatial update proposal was composed of the monitoring sites selected in the steps of redundancy identification, goals definition, and representativeness evaluation, with the addition of suitable river reaches for new monitoring sites that meet the trend analysis and/or the establishment of reference goals. Our study did not aim at proposing new areas for the regulation goal because it represents specific demands from the environmental agency (e.g., regulation of point source pollution and applied studies). New sites for the goal of water body representativeness were not proposed either because this goal only differs from the trend analysis goal by the duration of the data series. Finally, as a preliminary assessment of the reliability of our proposal, we ran the Mann-Whitney nonparametric test with a significance level of 0.05 for each WQI parameter considering the approved database in each UGRHI. This hypothesis test is frequently employed to statistically indicate whether two groups belong to the same population. Thus, we compared the data series from the previously existing monitoring sites with those from the selected monitoring sites to look for statistical differences or similarities in the data structure before and after the spatial optimization. The workflow is summarized in Fig. 2.
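
The before/after comparison can be sketched with SciPy's `mannwhitneyu`. The two-sided alternative, the pooling of values across sites into one array per parameter, and the function and variable names are assumptions; the text only specifies the test and the 0.05 significance level.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def compare_before_after(before, after, alpha=0.05):
    """Mann-Whitney test per parameter; returns {parameter: (p_value, significant)}.

    before / after: dicts mapping a parameter name to the values pooled over
    all existing sites and over the retained sites, respectively.
    """
    results = {}
    for parameter in before:
        x = np.asarray(before[parameter], dtype=float)
        y = np.asarray(after[parameter], dtype=float)
        x, y = x[~np.isnan(x)], y[~np.isnan(y)]  # drop missing values
        p_value = float(mannwhitneyu(x, y, alternative="two-sided").pvalue)
        results[parameter] = (p_value, p_value < alpha)
    return results


# Hypothetical series: DO is unchanged, turbidity is shifted after site removal.
base = np.arange(1.0, 31.0)
demo_results = compare_before_after(
    {"DO": base, "turbidity": base},
    {"DO": base.copy(), "turbidity": base + 100.0},
)
for name, (p, significant) in demo_results.items():
    print(name, "significant" if significant else "not significant")
```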

Fig. 2 Workflow summary for the spatial update proposal for each water resources management unit (UGRHI)

Results

Approved database

Following the initial data treatment procedure, more than 76,000 data points for the nine water quality parameters were approved, resulting in an average of ~1,200 data points per parameter in each UGRHI (Table 4). The descriptive statistics (median and percentiles) suggested a considerable variation in the water quality across the selected UGRHIs. In addition, data availability was heterogeneous due to different monitoring site densities and to the contrasting starting dates of operation.

Table 4 Total number of data points, missing data, censored data, median, 10% percentile, and 90% percentile for data on Escherichia coli (E. coli), pH, Biochemical Oxygen Demand (BOD), total nitrogen (TN), total phosphorus (TP), temperature (T), turbidity (Turb), total solids (TS), and dissolved oxygen (DO) for all water resources management units (UGRHIs)

Redundancy identification

For all the UGRHIs, the cophenetic correlation coefficient indicated the Euclidean distance associated with average, single, and complete linkage methods provided the lowest distortions of the original distances (higher coefficients). The average linkage had the best performance so we considered it the most appropriate method in five of the seven UGRHIs (Table 5). The cluster analysis suggested sampling sites produced redundant information in six of the seven UGRHIs, in which percentages of redundant sites varied from 13 to 59% (Table 5). The numbers of clusters formed in each UGRHI varied from 4 to 40, reaching the maximum in the UGRHI with the highest population density and the worst water quality (UGRHI 06). No redundancy was observed for UGRHI 01, with an average Silhouette Index of 1.00, indicating that all monitoring sites were classified in different groups.

Table 5 Summary of the cluster analysis results, with most appropriate linkage method, cophenetic correlation coefficient, average Silhouette Index, number of groups formed, and number of sites with redundancy for each water resources management unit (UGRHI)

Definition of monitoring goals

The most common monitoring goal we identified was trend analysis, with 108 monitoring sites (Table 6), more than 70% of which were concentrated in the three UGRHIs with the highest population density. On the other hand, the least represented goal was the establishment of reference, with only eight monitoring sites meeting this goal in four UGRHIs. Half of these sites were located in the UGRHI with the largest forest cover (UGRHI 03). The regulation goal was relevant in the network, with more than 21% of monitoring sites meeting this goal. The UGRHI with the highest number of monitoring sites for regulation (UGRHI 09) also presented the highest redundancy (Tables 5 and 6).

Table 6 Results for monitoring goals definition, with the number of sites that met each goal (trend analysis, regulation, establishment of reference conditions, and water body representativeness) and the number of sites that could be excluded based on redundancy to WQI parameters and monitoring goals. A monitoring site could meet more than one goal at the same time

Considering the monitoring goals, the potential reduction in the number of sites in the UGRHIs with redundancy ranged from 12 to 46% of the existing monitoring sites. In general, the increase in site redundancy was followed by an increase in the potential for site reduction, regardless of the definition of the monitoring goals for each site. This analysis indicates that of the initial 160 sites, 37 (23%) could be excluded, considering the redundancy in relation to the WQI parameters and also the established goals.

Monitoring network representativeness evaluation and spatial update proposal

The stratified sampling strategy showed that the number of strata for trend analysis differed across the UGRHIs, with a minimum of three and a maximum of 64 strata (Table 7). In general, the UGRHIs with larger areas had a greater number of strata, except for UGRHI 15, which had the highest number of strata but was not the largest in terms of area. The representativeness evaluation for trend analysis strata indicated that strata were better represented by existing monitoring sites in smaller UGRHIs (up to 100%), while strata representativeness below 20% was observed in larger UGRHIs.

Table 7 Results for stratified sampling strategy for each water resources management unit (UGRHI), with the number of identified strata for trend analysis goal and the number of identified strata for the establishment of reference goal. Representativeness evaluation is also presented for each UGRHI, with strata for trend analysis goal already represented by the existing monitoring sites and strata for the establishment of reference goal already represented by the existing monitoring sites

More than 89% of the strata for the establishment of reference goal were concentrated in the two UGRHIs with more than 57% of the area covered by protected areas (UGRHIs 03 and 11). In addition, less than 35% of the strata for the establishment of reference were represented by the existing monitoring sites. For the establishment of the reference goal, the identified strata showed that four out of the seven UGRHIs presented eligible river reaches to have reference monitoring sites (Table 7).

The final proposal for network expansion showed different patterns regarding the continuity of the existing monitoring sites and the number of suitable strata for network expansion in each UGRHI (Table 8). The continuity of the existing monitoring sites ranged from 63% to 100% of initial sites, with the lowest percentage in the UGRHI with the highest monitoring site redundancy (UGRHI 09). The number of suitable strata for network expansion ranged from 0 to 53 for trend analysis, 0 to 3 for the establishment of reference, and 0 to 15 for both goals. The UGRHI that had all strata represented was also the only one that had no redundant sites (UGRHI 01).

Table 8 Results for the spatial update proposal for each water resources management unit (UGRHI), with existing monitoring sites maintained in the final proposal, number of strata for trend analysis goal suitable for network expansion, number of strata for the establishment of reference goal suitable for network expansion, number of strata suitable for network expansion with river reaches that meet both trend analysis and establishment of reference goals, and final proposal for monitoring site density

The final proposal suggested reducing the monitoring site densities by up to 12% in the four UGRHIs with the highest initial monitoring site densities and the highest population densities. On the other hand, expansions in the number of monitoring sites from +125% to +390% were suggested in the three UGRHIs with the lowest population densities and the lowest initial monitoring site densities. The final monitoring site densities varied from 1.6 to 13.4 sites/1,000 km2 (Table 8). From the initial 160 existing monitoring sites, 126 (79%) were maintained and 127 new potential sites were identified (assuming only one monitoring site for each unrepresented stratum). The results for the Mann-Whitney nonparametric tests (α=0.05) showed that the proposed exclusion of monitoring sites did not significantly change the data series for most of the studied UGRHIs and parameters, indicating that the data series before and after optimization remained similar. Out of the 62 comparisons for the WQI parameters, only 10 (16%) presented a statistically significant difference between the data series before and after the spatial optimization. UGRHIs 06 and 09 accounted for all the observed differences, with seven and three in total, respectively. For UGRHI 06, only pH and temperature did not present statistical differences, while UGRHI 09 showed differences for BOD, temperature, and DO.
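The before/after comparison can be sketched with a two-sided Mann-Whitney U test. The version below is a minimal, standard-library illustration that uses the normal approximation with a tie correction and no continuity correction; it is an assumption that this matches the variant used in the study, and statistical packages may report slightly different p-values. The function name `mann_whitney_u` and the sample values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p_value), where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering the group (0 = x, 1 = y) of each value.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t**3 - t
        i = j
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return u, p_value


# Hypothetical dissolved oxygen values (mg/L) for one management unit.
before = [6.1, 5.8, 7.2, 6.5, 4.9, 6.8, 5.5, 7.0]  # all existing sites
after = [6.1, 7.2, 6.5, 6.8, 5.5, 7.0]             # sites kept after optimization
u, p = mann_whitney_u(before, after)
print("significant difference" if p < 0.05 else "no significant difference")
```

Repeating such a test for each parameter and management unit would yield a set of comparisons like the 62 reported above.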

The final spatial update proposed for UGRHIs 01, 03, 06, 09, 11, 14, and 15 (see an illustrative map in Fig. 3) is fully available in Online Resource 3 (Figures S1 to S7). For the UGRHIs with suitable areas for network expansion, new monitoring sites in the identified river reaches have the potential to promote better spatial homogeneity and representativeness.

Fig. 3 Initial distribution of existing monitoring sites in UGRHI 06 and the respective groups formed after the cluster analysis (left). The final proposed network spatial structure is shown (right) with the selected sites from the existing monitoring network and also the priority river reaches for network expansion (if applicable)

Discussion

The monitoring site redundancy with respect to the selected WQI parameters was relatively high for most of the studied UGRHIs, frequently above 20% (Table 5). Redundancy is also common in the WQMNs of other developing countries (e.g., India, Colombia, and Iran), where more than 40% of water quality monitoring sites with high similarity were reported (Mavukkandy et al. 2014; Tavakol et al. 2017; Peña-Guzmán et al. 2019). This situation, coupled with underrepresentation of some watersheds, can lead to WQMNs with non-optimal cost-benefit relationships (Peña-Guzmán et al. 2019; Camara et al. 2020) that may fail to identify spatial and temporal changes in water quality (Mahjouri and Kerachian 2011; Chen et al. 2012). Our results indicate an opportunity for spatial reassessment of the São Paulo State WQMN to reduce the operational costs of redundant monitoring sites, to allow expansion to areas lacking data, and to promote better coverage of the proposed monitoring goals.

The number of sites associated with the regulation goal (Table 6) contributed to the redundancy of the network. Such sites were mainly located to regulate point sources of pollution and were not designed to provide spatial representativeness of water quality in the watersheds. However, we highlight that parameters not analyzed in this study, such as heavy metals and toxicity, may shed further light on the importance of point source regulation sites. According to the land use cartographic base, artificial areas represent less than 4% of the UGRHIs’ area. However, more than 65% of the existing monitoring sites are located within 2 km of the artificial areas’ polygons. Such spatial concentration may lead to a low representation of the identified strata for some of the selected UGRHIs (Table 7), especially for those more heterogeneous in relation to the environmental features. We highlight the case of UGRHI 09, which presented the highest number of sites meeting the regulation goal, with 12 monitoring sites in the same river (Mogi-Guaçu River). This UGRHI had the highest statistical redundancy (59%) associated with the highest potential for exclusion of existing monitoring sites (37%).

The lack of monitoring sites for the establishment of reference conditions is another common deficiency in WQMNs worldwide. Such issue is frequently attributed to the rapid development of human activities and the consequent changes in river water quality (Cunha et al. 2011; Davies-Colley et al. 2011; Huo et al. 2013; Almeida et al. 2014; Kaboré et al. 2018). Researchers have developed several methodologies to estimate the reference conditions to alleviate the problem of defining monitoring sites in rivers and streams with low human disturbance in developed regions. These methods include the trisection method, statistical relationships, and models (e.g., Smith et al. 2003). However, most authors suggest that data collected at pristine sites provide the best indicator of baseline conditions since contrasting estimation methods could present differences up to 50% in the reference concentrations (Dodds and Oakes 2004; Huo et al. 2013; Hsieh et al. 2016). Our results reinforced the difficulty of finding suitable river reaches for establishing reference conditions. For example, we identified no strata for the establishment of reference in the predominantly agricultural UGRHIs (main land use of São Paulo State). These are characterized by the presence of large regions without protected areas, and intense deforestation as in other Brazilian agricultural areas in general (Sparovek et al. 2010; Calaboni et al. 2018). These areas associated with intensive agriculture have extensive use of pesticides and fertilizers, as well as land disturbance. These uses lead to problems such as high levels of nitrate, suspended solids, and turbidity that are found in São Paulo State agricultural catchments (Mori et al. 2015; Simedo et al. 2018).

The non-representativeness with respect to the reference sites could relate to the absence of this goal in the initial design of the São Paulo WQMN and to the concentration of sites in areas classified as artificial in the land use layer. Such a monitoring strategy is possibly due to the need for water quality data in areas with higher water demand for human uses, associated with more conflicts over the use of water resources, coupled with financial and operational constraints. According to our results from the steps of goal definition and representativeness evaluation, only 12% of the existing monitoring sites met the reference goal. However, the WFD (2003) strongly recommends the design of networks aiming at establishing reference conditions in the different types of water bodies (classified according to geomorphology, drainage area, physicochemical parameters) as a tool to support the definition of the ecological status, and to assess the anthropogenic impacts in water bodies (Pardo et al. 2012; Voulvoulis et al. 2017), as these factors may alter baseline water quality as well as responses to anthropogenic pressures. This is a crucial concern for the São Paulo WQMN, where the percentage of monitoring sites with poor and very poor water quality has been above 16% since 2013 (CETESB 2019). Thus, our approach of selecting small streams with low disturbance as representative of the establishment of reference goal could be an important initiative towards the establishment of baseline conditions in the study area.

In the São Paulo State WQMN, all monitoring sites include at least the WQI parameters, always sampled at the same frequency. The same procedure is adopted at the National WQMN in Brazil. Nevertheless, the São Paulo State WQMN has a different approach for water supply sites and for evaluating aquatic life protection, including parameters such as cyanobacteria cells, chlorophyll, and ecotoxicity assays. A partially similar approach is followed by other WQMNs worldwide, which have flexible monitoring methods depending on the monitoring goals of each site, usually providing better cost-benefit (Strobl and Robillard 2008).

In Europe, the WFD (2003) suggested differentiating the monitoring methods according to the monitoring goals into “surveillance”, “operational”, and “investigative”, each with different frequencies and parameters. The USA follows a similar approach, where the National Water Quality Assessment Program divides efforts into customized projects that aim at monitoring the national status of water quality, establishing reference conditions, and assessing changes in water quality by natural or anthropogenic features (Coles et al. 2019). In 2013, ANA started to build a National WQMN, run by the states, leading to a revision of the São Paulo sampling sites in the last five years. The novelty was the incorporation of specific monitoring goals for network design (“impact”, “strategic”, and “reference”) (ANA 2013) based on a larger scale methodology (basin levels 3 and 4 from the Otto Pfafstetter coding system). A more detailed design and different approaches based on environmental heterogeneity and goals, such as proposed in this paper, could result in a more regionalized São Paulo State WQMN. Moreover, once a balanced and representative WQMN is established, spot sampling can fill gaps and deal with finer scale determinations, or control actions if necessary. Thus, our results regarding the proposed monitoring goals could be a starting point towards adaptive water management in São Paulo and Brazil in general.

Our final proposed network structure presented an average site density weighted by the UGRHIs’ areas of 3.2 sites/1,000 km2, which is compatible with the minimum site density of 1.0 site/1,000 km2 recommended for the National WQMN (ANA 2013). This is comparable to other proposals (Table 9) that followed contrasting methodologies and considered different objectives for the WQMN optimization in the United States (Ouyang 2005), China (Ning and Bin 2004), Brazil (Calazans et al. 2018), and Malaysia (Camara et al. 2020), where average site densities varied from 2.6 to 4.6 sites/1,000 km2. However, other studies (Table 9) indicated contrasting densities as in watersheds of Hungary (Tanos et al. 2015), Colombia (Peña-Guzmán et al. 2019), Iran (Mahjouri and Kerachian 2011), and China (Wang et al. 2020), with average site densities from 0.1 to 42.2 sites/1,000 km2. In general, for most studies compiled in Table 9, the higher the initial monitoring site density, the higher the final proposed monitoring site density. We did not observe this pattern in most of the studied UGRHIs, as greater initial density did not necessarily lead to greater proposed final density of sites. Such differences are probably related to the evaluation of spatial representativeness based on environmental features we performed in contrast to most of the studies in Table 9. Our analyses also showed that site density can vary according to the spatial heterogeneity and monitoring goals, resulting in densities from 1.6 to 13.4 sites/1,000 km2 across different UGRHIs.
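To make the area-weighted average explicit, the following sketch computes it for a set of management units. The function name `area_weighted_density` and the site counts and areas are hypothetical and are not values from this study.

```python
def area_weighted_density(site_counts, areas_km2):
    """Area-weighted mean site density, in sites per 1,000 km2.

    Each unit's density is weighted by its area, which is algebraically the
    same as dividing the total number of sites by the total area.
    """
    if len(site_counts) != len(areas_km2):
        raise ValueError("one area is required per site count")
    total_area = sum(areas_km2)
    weighted_sum = sum(
        (n / a * 1000.0) * a for n, a in zip(site_counts, areas_km2)
    )
    return weighted_sum / total_area


# Hypothetical example: a small, densely monitored unit and a large, sparse one.
site_counts = [20, 10]       # number of monitoring sites per unit
areas_km2 = [2000.0, 8000.0]  # drainage area of each unit

densities = [n / a * 1000.0 for n, a in zip(site_counts, areas_km2)]
simple_mean = sum(densities) / len(densities)
weighted_mean = area_weighted_density(site_counts, areas_km2)
print(f"simple mean: {simple_mean:.2f}, area-weighted mean: {weighted_mean:.2f}")
```

In this hypothetical case the simple mean of the two densities is pulled up by the small unit, whereas the area-weighted value reflects the coverage of the whole territory.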

Table 9 Overview of some studies on water quality monitoring network optimization. For each reference, study area (river/country), watershed characteristics, optimization methods, drainage area, initial number of monitoring sites, proposed number of monitoring sites, initial site density, and proposed site density are shown

The data series of the existing monitoring sites versus the optimized sites were consistent for most of the UGRHIs, with no significant statistical differences for the WQI parameters. This indicated that the data structure was not significantly changed after the spatial update proposal, suggesting that the optimized network could provide a similar degree of information even with a reduction of 21% of the total number of monitoring sites. Only UGRHIs 06 and 09 presented statistical differences for some parameters. Optimization does not necessarily yield statistically similar data series, since the exclusion of redundant sites can significantly change the means of the WQI parameters and thus the structure of the dataset in general. We argue our methodology could be incorporated in other countries to promote a better link among network design, monitoring goals, and representativeness of land use, hydrological features, and geomorphological characteristics in the watersheds. However, if these data are used to simply cut back the total number of stations, no improvement would be seen. The resources would need to be re-allocated to create a better coverage of stations. This re-allocation should also consider overlapping water quality and hydrologic monitoring stations in areas where they are not congruent.

In the present study, there was a greater need for network expansions in the three UGRHIs with the lowest initial site densities and the lowest population densities. UGRHI 15, despite not being the largest one, presented the highest demand for monitoring expansion, but also potential exclusion of 21% of the existing monitoring sites. This suggests the current spatial distribution of the monitoring sites is especially unbalanced in this UGRHI, which is highly heterogeneous with respect to environmental features. Therefore, our results showed that area was not the single driver of the demand for new monitoring sites. The number of identified strata and the spatial representativeness of existing monitoring sites also influenced projected needs for new sites. Chen et al. (2012) also reported a lack of spatial representativeness and suggested a need for relocation of more than 20% of existing monitoring sites in the Heilongjiang River in Northeast China. On the other hand, the resulting expansion of 53 new sampling sites in a single UGRHI may be unrealistic due to financial constraints and the increase in operational costs, but prioritizing river reaches for the expansion can be a starting point toward a more representative network.

The congruence of water quality and hydrologic monitoring sites is an additional consideration: in São Paulo State they are not always co-located. Integrating water quality and quantity is crucial to effectively evaluating and managing this resource. While baseflow conditions are extremely important for some aspects of water quality at each site (e.g., average conditions experienced by biota, average drinking water quality), considering high flow events can also be important. If the concern is the mass of pollutants transported downstream, then flow weighting is necessary and co-location of hydrologic and water quality sites is particularly important.
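Flow weighting can be illustrated with the usual flow-weighted mean concentration, the sum of C_i × Q_i divided by the sum of Q_i, which requires a discharge value paired with each water quality sample. The sketch below is a minimal example; the function name `flow_weighted_mean` and the concentration and discharge values are hypothetical.

```python
def flow_weighted_mean(concentrations, discharges):
    """Flow-weighted mean concentration: sum(C_i * Q_i) / sum(Q_i).

    concentrations : sample concentrations (e.g., mg/L)
    discharges     : discharge measured at each sampling time (e.g., m3/s)
    """
    if len(concentrations) != len(discharges):
        raise ValueError("each sample needs a paired discharge value")
    total_flow = sum(discharges)
    if total_flow <= 0:
        raise ValueError("total discharge must be positive")
    return sum(c * q for c, q in zip(concentrations, discharges)) / total_flow


# Hypothetical samples: two at baseflow and one during a high flow event.
conc = [2.0, 2.0, 10.0]  # mg/L
flow = [1.0, 1.0, 8.0]   # m3/s

arithmetic_mean = sum(conc) / len(conc)
fw_mean = flow_weighted_mean(conc, flow)
print(f"arithmetic mean: {arithmetic_mean:.2f} mg/L, flow-weighted: {fw_mean:.2f} mg/L")
```

In this hypothetical series the single high flow sample dominates the flow-weighted value, which is why a mass-transport question cannot be answered from concentrations alone.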

The three UGRHIs with the highest population density had the highest potential reduction of the number of monitoring sites. High population density has a direct relationship with surface water pollution and can create unusual local conditions influencing water quality (Chen et al. 2016; Liyanage and Yamada 2017; Diamantini et al. 2018). Therefore, many authors recommend intensifying monitoring in such areas with poor surface water quality (Liyanage et al. 2016; Calazans et al. 2018). However, for the UGRHIs of interest, the strong presence of human activities generated a high number of monitoring sites meeting the same monitoring goals, leading to redundancy in relation to the WQI parameters. Brazilian regulation establishes five classes of water quality (Classes Special, 1, 2, 3, and 4), according to the expected uses of the water, with specific water quality guidelines allowing increasing levels of pollution. In urban areas of São Paulo State, poor quality river reaches (“Class 4”) are common and influence the quality of downstream reaches of the same river that have higher water quality requirements, demanding a higher number of sites in the same river. Therefore, such a legal point of view may partially explain the greater density of monitoring sites in urban areas. Although this was not one of the goals considered, our results confirm the initial hypothesis that the São Paulo State WQMN presents areas that may be either over- or underrepresented.

A crucial assumption of using homogeneous areas (e.g., strata, ecoregions) as a tool for the evaluation of spatial representativeness is that there is a spatial correlation between the water quality parameters and the environmental features used as input for the definition of the homogeneous areas. Thus, the choice of inappropriate scales or layers that have no correlation with water quality can lead to misclassifications of water bodies (Bailey 2004; Cheruvelil et al. 2008) as well as a misrepresentation of the overall water quality. It is well known that hydrological, geological, and anthropogenic characteristics are important drivers of water quality composition (Khatri and Tyagi 2015; Igwe et al. 2017). However, there is a great variety of layers that represent such characteristics, and it would be impractical to include all of them. In our study, we considered land use (anthropogenic), rainfall (hydrological), and soil types (geological) as input features, but additional parameters could control water quality. Our choice was based on data availability for the studied area, and on previous investigations that showed high correlations between such features and surface water quality (Shehane et al. 2005; Taka et al. 2016; Simedo et al. 2018). Potentially relevant additional layers proposed in other studies include watershed morphometry (Catherine et al. 2008), atmospheric deposition, population density, sources of pollution (Danz et al. 2005), climate, and geology (Omernik et al. 2000).
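The role of the input layers can be illustrated with a simplified sketch in which each stratum is taken to be a unique combination of land use, rainfall class, and soil type. This is an assumption made for illustration only, and the actual stratification procedure may differ; the function name `build_strata`, the reach identifiers, and the class labels are hypothetical.

```python
from collections import defaultdict


def build_strata(reaches):
    """Group river reaches by their (land use, rainfall class, soil type) combination.

    reaches : iterable of (reach_id, land_use, rainfall_class, soil_type) tuples
    Returns a dict mapping each combination (stratum) to its list of reach ids.
    """
    strata = defaultdict(list)
    for reach_id, land_use, rainfall_class, soil_type in reaches:
        strata[(land_use, rainfall_class, soil_type)].append(reach_id)
    return dict(strata)


# Hypothetical river reaches with categorical attributes from the three layers.
reaches = [
    ("r1", "agriculture", "high", "latosol"),
    ("r2", "agriculture", "high", "latosol"),
    ("r3", "forest", "high", "argisol"),
    ("r4", "artificial", "low", "latosol"),
]
strata = build_strata(reaches)
print(f"{len(strata)} strata identified from {len(reaches)} reaches")
```

Adding or removing a layer changes the number of combinations, and therefore the number of strata, which is why layers unrelated to water quality can inflate the apparent need for new sites.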

Conclusions

Our study indicated that monitoring sites of the São Paulo WQMN are concentrated in areas with higher human disturbance levels. According to the proposed methodology, this concentration generated redundancies above 20% of the existing monitoring sites in most of the studied watersheds, thus raising the possibility of their exclusion, and pinpointed the lack of sites suitable for the establishment of reference conditions. Especially in watersheds with more heterogeneous environmental features, such as UGRHIs 11, 14, and 15, a low spatial representativeness was established, with less than 35% of the identified strata represented. Moreover, we reported the possibility of exclusion of existing monitoring sites in some of the UGRHIs, with a potential reduction of up to 37%. The UGRHIs with the highest population densities presented potential reduction of the number of monitoring sites of up to 12%, while the UGRHIs with the lowest population densities presented a potential for expansion of up to 390%, resulting in final monitoring site densities from 1.6 to 13.4 sites/1,000 km2. We argue that the WQMN optimization does not necessarily imply reduction in the number of monitoring sites. In practical terms, the need for expansion of the network is an even more critical aspect than the elimination of redundant monitoring. Redundant monitoring sites still provide data to support water quality management in the represented area. Conversely, areas with a lack of monitoring sites can provide only limited data to water quality programs. The feasibility of such expansions needs further evaluation from the environmental agency because of their financial impact, and additional criteria to prioritize the river reaches for expansion could be necessary, such as population density and/or strata’s area.
We argue that the WQMN optimization is especially necessary for developing countries like Brazil, where i) there are financial constraints for the investments in design, operation and maintenance of the WQMN; ii) surface water quality is strongly affected by point and nonpoint sources of pollution; iii) fast population growth and limited sanitation infrastructure exacerbate the presence of conflicts for water uses (e.g., irrigation, energy generation, public water supply, industry).

Although there is no scientific consensus about the best method for the WQMN spatial optimization, the cluster analysis followed by monitoring goals definition and stratified sampling strategy presented here provided results consistent with other studies in the literature. The main advantage of our approach is the consideration of relevant environmental features affecting water quality (anthropogenic, hydrological, and geological aspects) at a more detailed scale to guide proposed reduction or expansion in the number of monitoring sites towards increased representativeness, while taking into account the monitoring goals. Our approach could benefit from the integration between water quality data, streamflow, and other physical data (e.g., discharge, water velocity and river/stream morphometry). Because such integration is not common in São Paulo State and in Brazil in general, data were not available, and these aspects were not considered in the redundancy analysis, but we are aware that this is another important criterion to be included in WQMN optimization. We expect our flexible methodology could serve as a basis for the WQMN optimization in other developing countries, with the possibility of adapting the parameters for cluster analysis, the monitoring goals, and the input features for the stratified sampling strategy procedure. Future research initiatives are still needed to indicate the efficiency and advantages of the proposed strategy. Further comparisons with widely used WQMN optimization methodologies could represent a starting point in this direction.