Introduction

With the rapid development of the rural economy and continued population growth, combined pollution increasingly endangers the health of aquatic ecosystems in most river networks, especially in the suburban areas of large developed cities. The water quality of multistage rivers, which act as important and common hydraulic channels for the collection and delivery of pollutants at the catchment scale, has gradually deteriorated. To address this problem, the local government has promulgated a series of decrees and allocated abundant funds. However, actual practice often lacks scientific guidance. Consequently, the low efficiency of remediation for polluted rivers and the unsatisfactory input–output ratio of funds for watershed management have weakened the public’s confidence (Short et al. 2012). The watershed or subcatchment is usually considered the appropriate spatial administrative unit (Su et al. 2011). Thus, besides source control measures, remediation of polluted river networks is also of high importance. In fact, to develop a targeted control strategy and an effective top-level design for watershed management, the first challenge is to analyze the fate of complex pollutants in river systems (Short et al. 2012).

In general, organic matter is an important indicator of river water quality. Chemical oxygen demand (COD) and biochemical oxygen demand (BOD) are two widely used indicators of the content of reducing components (mainly organic matter) in water. COD is further used as a gross control index for reducing the discharge of organic matter in watershed management. In addition, total organic carbon (TOC) or dissolved organic carbon (DOC) can reflect the level of dissolved organic matter (DOM). However, traditional methods for measuring these water quality metrics are time-consuming and may consume large amounts of reagents. Moreover, these indexes only represent the overall DOM quantity (a mixture of organic substances) with some special properties (Coble 2007; Korak et al. 2014) and cannot reveal its composition (different kinds of organic matter) or dynamics. Therefore, to fill this gap and comprehensively track organic matter loadings in rivers, fluorescent DOM (FDOM) has been suggested as a useful alternative indicator in recent years (Graeber et al. 2012; Lapierre and Frenette 2009; Meng et al. 2013). To avoid the limitations and controversies of traditional approaches such as “peak picking” and fluorescence regional integration (FRI) (Korak et al. 2014; Li et al. 2013b), the excitation–emission matrix combined with parallel factor analysis (EEM-PARAFAC) technique was used in this study. This widely accepted method can resolve and evaluate individual fluorescence fractions from the overlapping fluorescence spectra of multifluorophore solutions (Ishii and Boyer 2012).

The study area is a moderate-sized subcatchment (1506 km²) suffering from typical combined pollution (agricultural pollutants and domestic sewage, both important FDOM sources) (Study area section and Fig. 1). The affiliated rivers are divided into three orders, which makes them suitable for addressing the aforementioned scientific questions. Numerous studies have examined FDOM in similarly polluted waters (Massicotte and Frenette 2011; Mostofa et al. 2013; Yao et al. 2011), but these works have focused primarily on natural or anthropogenic FDOM and its direct effects on the receiving water body (Hudson et al. 2007). For agricultural diffuse pollutants, the content of diverse fluorescent materials and the ratio of protein-like to humic-like fluorescence in water exuded from animal slurry differed markedly from those in uncontaminated water (Naden et al. 2010). Wilson and Xenopoulos (2009) stated that intensive agricultural activity reduces the structural complexity of FDOM and increases the amount of biodegradable FDOM in rivers. For anthropogenic diffuse pollutants, besides animal and fish breeding wastewater, the discharge of rural and urban domestic effluents can also mark the receiving rivers with a distinctive protein-like fluorescence signature (Goldman et al. 2012; Hudson et al. 2007). Furthermore, FDOM has been shown to track dissolved humic-like materials in mountainous rivers (Chen et al. 2013) and to facilitate effective water quality management for eutrophication control (Zhang et al. 2011).

Fig. 1

Location of the study area and distribution of the sampling sites (see text for details). SP source pollutants, ETR East Tiaoxi River, NTR North Tiaoxi River, MTR Middle Tiaoxi River, STR South Tiaoxi River, NTR′, MTR′, and STR′ the secondary tributaries, RW rainwater, CW clean water (headwater of rivers)

However, the potential structural disparity of FDOM and its subsequent fate in multistage rivers at the subbasin scale have not been clearly established in the existing literature. Previous studies have mainly focused on small-scale catchments with single FDOM sources (Zhang et al. 2011) or on complex FDOM sources at broad geographic scales such as the Lake Taihu watershed (Borisover et al. 2009; Massicotte and Frenette 2011; Yao et al. 2011). Our study area, the East Tiaoxi River (ETR) watershed in the southwest hilly area around Lake Taihu, has been considered a relatively clean agricultural catchment. In reality, the pollution status of this subcatchment is much more complicated, and knowledge of the FDOM distribution in its multistage rivers is still fragmented and inconsistent.

We hypothesized that the distribution of combined organic pollutants in multistage rivers could be revealed by the quantity and composition of FDOM (as a tracer) with the aid of the EEM-PARAFAC technique. Accordingly, this study had two objectives: (1) to investigate and evaluate, quantitatively and qualitatively, the optical characteristics of FDOM in multistage rivers in a representative subcatchment and (2) to explore the spatial heterogeneity of contamination levels in multistage rivers, the fate of FDOM along the migration directions, and their potential significance for watershed management.

Materials and methods

Study area

The study area (E 119° 30′–120° 8′, N 30° 3′–30° 33′), an integrated administrative unit for the local government, is the upstream part of the ETR watershed in eastern coastal China (Fig. 1; Sect. S1 in Supplementary Materials). ETR is one of the largest rivers flowing into Lake Taihu (Su et al. 2011). This shallow eutrophic lake has attracted great attention in recent years because of its severe water quality degradation and algal bloom problems (Guo 2007; Yu et al. 2013). The local water conservancy bureau usually classifies the stream orders of ETR into three levels: the secondary tributaries (the first order), the primary tributaries (the second order), and the main stream (ETR, the third order) (Horton 1945; Li et al. 2013a). The three primary tributaries are the North Tiaoxi River (NTR), the Middle Tiaoxi River (MTR), and the South Tiaoxi River (STR) (Fig. 1). They all originate in the Tianmushan Mountains, with subbasin areas of 720, 229, and 310 km², respectively. The secondary tributaries that belong to NTR, MTR, and STR (the second order) were labeled NTR′, MTR′, and STR′ (the first order), respectively. Liban Reservoir (Fig. 1) is located near the mountaintop and is free from industrial and agricultural influence. Additionally, the land-use classification and a detailed description of the combined pollutants in this area (agricultural and domestic wastewater, the main pollution sources) are given in Sect. S1 in Supplementary Materials.

Sample collections

Water samples were collected on sunny days (except for five rainwater samples) in summer during the rainy season (Sect. S1 in Supplementary Materials) at the 91 sites shown in Fig. 1. Most of them were from rivers and reservoirs. Seven samples from the clean upstream rivers served as controls. In addition, 12 supplemental water samples were taken from the pollution sources. These mainly included agricultural drainage (rice cultivation and fish, livestock, and poultry breeding), decentralized rural domestic wastewater, and the effluent from the secondary sedimentation tank of a municipal sewage treatment plant. Each sample was stored in a 500 mL pre-cleaned polyethylene terephthalate (PET) bottle. After being transported to the laboratory, all samples were filtered through 0.45-μm cellulose membrane filters (Millipore, Pittsburgh, PA) in a low-pressure suction filter device (Goldman et al. 2012). They were then stored at −20 °C in the dark to avoid alteration (Zhang et al. 2011).

Fluorescence measurements and data pretreatment for PARAFAC analysis

The EEMs of the water samples were measured with a FluoroMax-4 fluorescence spectrometer (HORIBA Jobin Yvon, Paris, France) in an air-conditioned room. Excitation wavelengths ranged from 250 to 500 nm and emission wavelengths from 300 to 550 nm, both at 5-nm intervals. Excitation wavelengths below 250 nm were not considered because of possible signal interference from nitrate (Jamieson et al. 2014) and fluctuation of the light source (Stedmon and Bro 2008). The slit width for both the excitation and emission monochromators was 5 nm, which avoids serious scattered light interference (Wilson and Xenopoulos 2009). The scanning speed and integration time were 80 nm s−1 and 0.1 s, respectively. EEMs were collected in signal over reference (S/R) mode, and the instrument-specific excitation and emission corrections were applied (Mladenov et al. 2011). Other specific analysis methods are given in Sect. S2 in Supplementary Materials. Then, several post-treatment steps were applied to the raw EEMs to correct the fluorescence spectra according to the requirements of the PARAFAC model. First, the EEMs were corrected using the ultraviolet (UV)-visible absorption spectrum from 250 to 700 nm to eliminate the primary and secondary inner filter effects (IFEs) (Sect. S2 in Supplementary Materials). Second, the fluorescence intensity (FI) of the EEMs was normalized to quinine sulfate units (QSU) using a standard quinine sulfate solution to correct for daily fluctuations due to lamp decay. One QSU is the FI of a 0.01 mg L−1 standard quinine sulfate solution in 1 N H2SO4 (qs) at excitation/emission wavelengths of 350/450 nm (Zhang et al. 2011). Third, the EEM spectrum of ultrapure water was subtracted to eliminate most Raman and Rayleigh scattering signals (Goldman et al. 2012).
Then, the FI of two uninformative triangular areas (λem ≤ λex + 5 nm and λem ≥ λex + 300 nm) in each EEM was set to zero (Yao et al. 2011) to avoid producing “false peaks” and to speed up the PARAFAC model (Stedmon et al. 2003). Moreover, negative values in the Rayleigh scattering areas were replaced with zero (Kothawala et al. 2014). However, these steps could not remove the scattering signals completely without distortion. Thus, to remove the residual scatter effectively, the Delaunay triangular interpolation method was applied. Note that only the scattering region (a symmetrically expanded region of 20 nm where λem equals one, two, and three times λex) was interpolated from the adjacent data (Bahram et al. 2006). The corresponding algorithms, EEMSCAT v2.0 (University of Copenhagen, Copenhagen, DK) and PLS Toolbox v3.5.3 (Eigenvector Research, Inc., Manson, WA), running on MATLAB 2010b (MathWorks, Natick, MA), were used for the calculations (Bahram et al. 2006).
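The normalization, blank subtraction, and zeroing steps described above can be illustrated with a short NumPy sketch. It is not the MATLAB workflow used in the study. It assumes an EEM stored with emission wavelengths along rows and excitation wavelengths along columns on the 5-nm grids stated above; the function name `pretreat_eem` and its arguments are hypothetical. Inner filter correction and the Delaunay interpolation of the scatter bands are omitted, and, as a simplification, negative values are clipped everywhere rather than only in the Rayleigh scattering areas.

```python
import numpy as np

# Wavelength grids stated in the text: 5-nm steps, 51 points each.
EX = np.arange(250, 501, 5)   # excitation, nm
EM = np.arange(300, 551, 5)   # emission, nm


def pretreat_eem(eem, blank, qs_intensity):
    """Return a blank-subtracted EEM in QSU with the two triangles zeroed.

    eem, blank   : arrays of shape (len(EM), len(EX)) in the same raw units
    qs_intensity : raw intensity of the 0.01 mg/L quinine sulfate standard
                   at ex/em = 350/450 nm (this value defines 1 QSU)
    """
    eem = np.asarray(eem, dtype=float)
    blank = np.asarray(blank, dtype=float)
    # Subtract the ultrapure-water blank and express the result in QSU.
    out = (eem - blank) / qs_intensity
    # Zero the regions em <= ex + 5 nm and em >= ex + 300 nm.
    em_grid, ex_grid = np.meshgrid(EM, EX, indexing="ij")
    mask = (em_grid <= ex_grid + 5) | (em_grid >= ex_grid + 300)
    out[mask] = 0.0
    # Simplification (assumption): clip all negative values to zero.
    return np.clip(out, 0.0, None)
```

Keeping the wavelength grids as module-level arrays makes the mask explicit and easy to check against the two inequalities quoted in the text.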

PARAFAC modeling

The DOMFluor toolbox (Stedmon and Bro 2008) for MATLAB was employed for the PARAFAC modeling of the three-dimensional array (91 samples × 51 emission wavelengths × 51 excitation wavelengths). Two outliers (samples from the municipal wastewater treatment plant and the farm product processing facility) were identified with the internal leverage diagnostic tool owing to their high loadings and leverage in the initial exploratory analysis. Non-negativity constraints were applied to the excitation/emission loadings and concentration scores because negative values have no practical meaning (Borisover et al. 2009). PARAFAC models were generated with two to ten components. Then, several diagnostic tools, including residual analysis, core consistency, and split-half analysis, were employed to determine the optimal component number (Andersen and Bro 2003; Stedmon and Bro 2008). Finally, after verification by the random initialization procedure, a robust and credible model was established (Stedmon and Bro 2008). The spectral characteristics of the isolated components are given by the excitation and emission loadings (Borisover et al. 2009; Zhang et al. 2011). The calibrated maximal fluorescence intensity (FImax) of the PARAFAC components is recommended for quantitative comparison of FDOM between samples (Massicotte and Frenette 2011; Stedmon and Markager 2005).
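To make the trilinear structure behind these terms concrete, the sketch below reconstructs an EEM array from scores and loadings and derives a per-component maximum intensity. It is a NumPy illustration, not the DOMFluor code. The function names are hypothetical, and the FImax convention used here (score multiplied by the maximum emission loading and the maximum excitation loading of that component) is an assumption based on common practice rather than something stated in this section.

```python
import numpy as np


def reconstruct(scores, em_load, ex_load):
    """Trilinear PARAFAC model: X[i, j, k] = sum_f A[i, f] * B[j, f] * C[k, f].

    scores  : (n_samples, n_components)
    em_load : (n_emission, n_components)
    ex_load : (n_excitation, n_components)
    """
    return np.einsum("if,jf,kf->ijk", scores, em_load, ex_load)


def fimax(scores, em_load, ex_load):
    """Maximum intensity of each component in each sample.

    Assumed convention: score times the component's maximum emission
    loading times its maximum excitation loading.
    """
    return scores * em_load.max(axis=0) * ex_load.max(axis=0)
```

For a model with a single component, this quantity equals the peak of the reconstructed EEM of each sample, which is why it is convenient for comparing samples in intensity units.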

Drawing and statistical analysis

The sampling map of the study area (Fig. 1) was plotted using ArcGIS 9.3 (ESRI, Redlands, CA). Correlation and linear regression analyses between PARAFAC components were performed in OriginPro 8 (OriginLab Corp., Northampton, MA). All average values are presented as the mean ± standard error. Before the multiple comparisons of mean values, the widely used Box–Cox transformation (Francis et al. 2009) was applied to the non-normally distributed variables in Minitab 17 (Minitab Inc., State College, PA). Then, the transformed data (C1, C2, and C3; normality verified by the Shapiro–Wilk test, p > 0.05, and homogeneity of variance verified by Levene’s test, p > 0.05) were compared by one-way analysis of variance (ANOVA) with Fisher’s LSD test at the 0.05 significance level. A non-parametric method (the Kruskal–Wallis test) was used for the same purpose for C4 because its data still did not follow a normal distribution. Additionally, for principal component analysis (PCA), the scores of the four PARAFAC components were normalized to 0–1, and the Varimax rotation method was used. PCA factors with an explained variance of less than 10 % were discarded (Lapierre and Frenette 2009). The abovementioned analyses were performed in SPSS 20.0 (SPSS Inc., Chicago, IL).
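The Box–Cox step can be sketched as follows. This is a NumPy illustration, not the Minitab procedure used in the study: the function names are hypothetical, and choosing λ by a grid search over the profile log-likelihood is an assumption about how the parameter might be selected. The subsequent normality, variance, and ANOVA tests are not reproduced.

```python
import numpy as np


def boxcox(x, lam):
    """Box-Cox transform of strictly positive data for a given lambda."""
    x = np.asarray(x, dtype=float)
    if abs(lam) < 1e-12:
        return np.log(x)               # limiting case lambda -> 0
    return (x ** lam - 1.0) / lam


def boxcox_mle(x, grid=np.linspace(-2.0, 2.0, 401)):
    """Pick lambda on a grid by maximizing the profile log-likelihood."""
    x = np.asarray(x, dtype=float)
    n, log_sum = x.size, np.log(x).sum()

    def loglik(lam):
        # -n/2 * ln(variance of transformed data) + (lambda - 1) * sum(ln x)
        return -0.5 * n * np.log(boxcox(x, lam).var()) + (lam - 1.0) * log_sum

    return max(grid, key=loglik)
```

A value of λ near 0 corresponds to a log transform and λ = 1 leaves the shape of the data unchanged, so the selected λ gives a quick sense of how skewed the FImax values were before transformation.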

Results

Optical properties of FDOM by traditional peak picking analysis

Six typical EEMs from diverse sources (Fig. 2a–f) were chosen to demonstrate their distinctive optical features. Based on the results reported by Coble (2007) and Fig. S1 in Supplementary Materials, three dominant fluorescence peaks were recognized and defined: peak C indicates typical humic-like materials at visible wavelengths with λ ex/em = 340/425 nm; peak A indicates typical humic-like materials at UV wavelengths with λ ex/em = <250/420 nm; and peak T represents typical tryptophan-like materials with λ ex/em = 280/340 nm. The FImax of peak C in these water samples (Fig. 2a–f) was about 2.7, 15.9, 30.7, 24.1, 8.2, and 7.4 QSUs, respectively. Except for Liban Reservoir (2.7 QSUs), all samples were rich in peak C, with FImax values ranging from 7.4 to 30.7 QSUs. Similar patterns in FImax were found for peak A and peak T (Fig. 2a–f). Therefore, Fig. 2 revealed the preliminary composition of FDOM and the content of three different fluorophores in various water samples. They were generally in the order of source pollutant (SP) > secondary tributary > initial rainwater (RW) > primary tributary > main stream > clean water (CW). Moreover, except for SP, RW, and CW, obvious tyrosine-like peaks (traditionally defined as peak B) also appeared in the EEMs of some river water samples (Fig. S2 in Supplementary Materials).

Fig. 2

EEM contour plots of six typical water samples. a collected from the headwater of South Tiaoxi River (STR), b initial rainwater, c effluent of the treated rural domestic wastewater by septic tank coupled with biological contact oxidation reactor, d mainly polluted by fish breeding and domestic sewage, and e–f sampled from the downstream of rivers. Numbers (QSUs) in each subdiagram indicate the FImax of peaks A, C, and T traditionally defined

Determination of PARAFAC components and their fluorescence characteristics

Residual analysis and the core consistency diagnostic (CORCONDIA) were carried out to determine the optimum component number. The sum of squared errors decreased sharply (Fig. S3 in Supplementary Materials), and the residual errors improved only negligibly when the component number exceeded four, suggesting that three or four components were suitable for the PARAFAC model. Nevertheless, only the four-component model could be validated by split-half analysis and fitted by random initialization. Additionally, the CORCONDIA analysis (Andersen and Bro 2003) for the multicomponent models (component number less than seven) again supported the four-component model. Specifically, when the component number changed from four to five, the explained variance changed very little (from 99.2 to 99.4 %), but the core consistency decreased from 75.2 to 23.3 % (Table 1). PARAFAC models are considered misspecified if their core consistency values are less than 50 % (Bro 1998). Thus, four independent fluorescent components with distinct characteristics were isolated. The spectral characteristics and split-half validation results for the isolated components are illustrated in Fig. 3. Similar fluorescent components from other aquatic ecosystems are listed in Table S1 and Fig. S1 (Supplementary Materials). The PARAFAC components are described in detail below.
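The core consistency part of this decision can be written as a small rule. The sketch below is only illustrative: the function name `pick_components` is hypothetical, the rule (take the largest component number whose core consistency is at least 50 %) is a simplification of the full procedure, which also relied on residual analysis, split-half validation, and random initialization, and only the two rows of Table 1 quoted in the text are used.

```python
def pick_components(diagnostics, min_core=50.0):
    """Return the largest component number with acceptable core consistency.

    diagnostics maps component number -> (explained variance %, core consistency %).
    """
    valid = [n for n, (_, core) in diagnostics.items() if core >= min_core]
    if not valid:
        raise ValueError("no model meets the core consistency threshold")
    return max(valid)


# Only the two rows quoted in the text; the rest of Table 1 is not reproduced.
table1_excerpt = {4: (99.2, 75.2), 5: (99.4, 23.3)}
print(pick_components(table1_excerpt))  # prints 4
```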

Table 1 Explained variance and core consistency diagnostic for PARAFAC models with two to seven components in percentage terms
Fig. 3

Fluorescent-independent components from the PARAFAC model. a–d contour plot of each component (top panel); e–h excitation and emission loadings for four components, based on the complete data set (lines) and two halves of the three-dimensional matrix (circles and crosses) by split-half analysis (bottom panel)

Component 1 (C1), the visible humic-like component (Stedmon and Markager 2005), had two peaks at λ ex/em = 255(360)/455 nm. C1 could be characterized as a mixture of the traditionally defined peaks A and C (Coble 2007). Likewise, the spectral characteristics of C1 resembled those of component 2 and anthrahydroquinone-2,6-disulfonate (AHDS, a reduced quinone-like material) reported by Ishii and Boyer (2012). This highly humified component typically has a large apparent molecular weight (larger than 1000 Da), and it is derived from a wide range of sources such as forest streams, agricultural rivers, wastewater, soil leachate, animal and plant debris, etc. (Ishii and Boyer 2012).

Component 2 (C2), a visible humic-like component like C1, exhibited peaks at λ ex/em = <250(320)/395 nm. Similar to the mixture of peaks A and M (traditionally defined as the visible marine humic-like component in ocean studies) (Coble 2007), C2 also resembled component 3 and the anthraquinone-2,6-disulfonate (AQDS) summarized by Ishii and Boyer (2012). AQDS has a blue-shifted emission peak relative to AHDS (Cory and McKnight 2005). Generally, C2 also exists in a wide range of freshwater environments (Stedmon and Markager 2005; Stedmon et al. 2003). However, both the degree of conjugation of its molecular structure and its chemical stability are lower than those of C1 (Guo et al. 2014).

Component 3 (C3) had a single peak at λ ex/em = 275/335 nm. This typical tryptophan-like component was similar to peak T traditionally reported (Coble 2007). The blue-shifted peak of C3 resembled the standard tryptophan solution, which exhibits peaks at λ ex/em = 275–280/352–356 and 225/343–358 nm (Mostofa et al. 2013). Moreover, some non-protein-like fluorophores including polyphenolic compounds in humic-like substances (e.g., lignin, gallic acid) exhibited the same peak T fluorescence (Hernes et al. 2009; Maie et al. 2007), which might be related to the existence of critical molecular structure like phenol or aniline (Li et al. 2013b). Overall, the tryptophan-like component is an effective indicator for pollution by human activities and biological metabolites by algae blooms (Coble 2007).

Component 4 (C4) exhibited a primary fluorescence peak at λ ex/em = <250/305 nm within the limited fluorescence scanning range. It could be characterized as a tyrosine-like fluorophore (traditionally reported as peak B) (Coble 2007). This labile protein-like material is often produced autochthonously by plants and phytoplankton in rivers and lakes (Guo et al. 2014; Yao et al. 2011; Zhang et al. 2011). In addition, C4 had a secondary peak at λ ex/em = <250/525 nm, but it differed markedly from typical humic-like materials because of its abnormal peak shape. This feature is probably due to the fluctuation of fluorescence signals or the inherent limitations of the PARAFAC model.

Spatial distributions of PARAFAC components in multistage rivers

As shown in Fig. 4, the average FImax of C1, C2, and C3 (but not C4) in the SPs (25.0 ± 2.0, 31.0 ± 2.5, and 25.8 ± 5.3 QSUs, respectively) was significantly higher (p = 0.000–0.043) than in the other sample types. The abundance of C1, C2, and C3 in SP was approximately 1.4 to 4.7 times higher than that of the river water, including ETR, NTR, MTR, and STR (p < 0.05). Additionally, the average FImax of C2 and C3 in RW was greater than in CW (p < 0.0001), whereas C1 and C4 showed no significant difference (p > 0.05). By comparison, the average FImax of C2 and C3 in the headwater (CW) was significantly lower than in the other samples (p < 0.012), and C1 also remained at an extremely low level (p < 0.039, except for RW with p > 0.05). Additionally, the average FImax of C4 in CW, RW, NTR′, MTR, and even SP showed no significant difference (p > 0.05) and was lower than that in STR′, STR, and ETR (p < 0.05, except for STR′ and RW with p > 0.05). For example, the FImax of the four PARAFAC components in Liban Reservoir was only 2.8, 2.9, 1.0, and 0.0 QSUs, respectively.

Fig. 4

Spatial variations of four PARAFAC components in diverse water samples. The boxes stand for the interquartile range (25th to 75th percentiles). The empty squares and lines in the boxes indicate the mean and median values, respectively. The vertical dashed lines divide the figure into four independent parts. The same lowercase letters denote no significant difference (p > 0.05), while different lowercase letters show significant difference (p < 0.05)

In particular, the FImax of C1, C2, and C3 in the secondary tributaries was greater than that in the corresponding primary tributaries, NTR, MTR, and STR (p = 0.000–0.034). Taking NTR as a representative, the average FImax of C1, C2, and C3 in the secondary tributaries (NTR′) was 18.2 ± 3.7, 21.3 ± 4.2, and 12.7 ± 3.4 QSUs, respectively. The abundance of each of these three components in NTR′ was higher than that in NTR (p < 0.05) but lower than that in SP (p < 0.05). However, the average FImax of C4 showed no significant difference between the secondary tributaries and the related primary tributaries, such as MTR′ and MTR (p > 0.05). Furthermore, the FImax of C1 and C2 in ETR was 12.2 ± 1.1 and 15.1 ± 1.4 QSUs, respectively. Both were slightly higher in ETR than in the three primary tributaries (p < 0.05 for MTR and C2 of STR). In contrast, the FImax of C3 in ETR did not change (p > 0.05) and remained at a high level (approximately 7.0 QSUs) after receiving the three primary tributaries. Similarly, the content of C4 in ETR (12.5 ± 2.3 QSUs) was close to that in the multistage rivers (p > 0.05, except for NTR′ or MTR with p < 0.05). Additionally, except for C4, MTR seemed to have a lower content of C1, C2, and C3 (7.7 ± 0.9, 9.2 ± 1.0, and 5.5 ± 1.3 QSUs, respectively) than NTR (10.6 ± 0.9, 12.4 ± 1.1, and 6.7 ± 0.7 QSUs, respectively) and STR (9.2 ± 0.9, 11.3 ± 1.1, and 7.4 ± 1.0 QSUs, respectively), although the differences were not significant (p > 0.05). Therefore, NTR and STR might have somewhat more serious effects on ETR than MTR did. Liang et al. (2013) also stated that MTR was less polluted than NTR and STR due to its lower population density.

Composition of FDOM and two-step PCA for PARAFAC components

The average FImax values of the four components and their relative abundance in the various water samples (expressed as the percentage of each component in the total FImax) were compared. The average relative contributions and FImax values of the PARAFAC components were generally in the order of C2 (34.3 ± 0.8 %, 15.6 ± 0.9 QSUs) > C1 (27.9 ± 0.8 %, 12.6 ± 0.8 QSUs) > C3 (21.2 ± 1.0 %, 10.0 ± 1.0 QSUs) > C4 (16.6 ± 1.8 %, 8.8 ± 1.1 QSUs) (Fig. S4 in Supplementary Materials). Furthermore, the varying contributions of the humic-like (C1 and C2) and protein-like (C3 and C4) components for different kinds of water samples are shown in detail in Fig. S4 and Fig. S5 in Supplementary Materials.
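The relative abundance defined above is simple arithmetic on the FImax values of one sample. As a minimal sketch, the snippet below applies it to the Liban Reservoir values quoted earlier in the Results (2.8, 2.9, 1.0, and 0.0 QSUs for C1 to C4); the function name is hypothetical, and the resulting percentages are computed here for illustration rather than taken from the paper.

```python
def relative_abundance(fimax_values):
    """Percentage contribution of each component to the summed FImax."""
    total = sum(fimax_values)
    if total == 0:
        raise ValueError("total FImax is zero; percentages are undefined")
    return [100.0 * value / total for value in fimax_values]


# FImax of C1..C4 in Liban Reservoir, as quoted in the text (QSU).
liban = [2.8, 2.9, 1.0, 0.0]
print([round(share, 1) for share in relative_abundance(liban)])
# prints [41.8, 43.3, 14.9, 0.0]
```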

Taking all samples together, an exploratory PCA was first performed using the FImax of the four components as a whole data set (Miller and McKnight 2010). The Kaiser-Meyer-Olkin value (KMO = 0.6 > 0.5) and Bartlett’s test of sphericity (p < 0.001) supported the suitability of the data for PCA (Guo et al. 2014). Only two principal components (factor A and factor B), those with eigenvalues greater than 1, were extracted. Factor A and factor B accounted for 61.5 and 25.6 % of the explained variance, respectively. Their formulas, expressed in terms of the four PARAFAC components, are given in Fig. 5a. C1, C2, and C3 all have high factor A loadings (>0.5), while C4, with a high factor B loading (>0.5), dominates factor B. Furthermore, a second PCA was carried out using C1, C2, and C3 as a separate data set (without C4). Its feasibility was also confirmed by the KMO value of 0.6 and Bartlett’s test of sphericity (p < 0.001). Two secondary principal components, named factor 1 and factor 2, were then separated. Together they explained 99.2 % of the variance, accounting for 81.9 and 17.3 %, respectively. Similarly, Fig. 5c demonstrates that C1 and C2, with high factor 1 loadings (>0.5), govern factor 1, whereas C3, with a high factor 2 loading (>0.5), plays the leading role in factor 2. These results agreed very well with the findings reported by Guo et al. (2014) and Yao et al. (2015).
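The extraction step of such a PCA (0–1 normalization, eigen-decomposition, and retention of factors with eigenvalues above 1) can be sketched in NumPy as follows. This is not the SPSS procedure used in the study: the function names are hypothetical, the Varimax rotation, KMO value, and Bartlett’s test are omitted, and the example data are synthetic values generated only to exercise the code.

```python
import numpy as np


def minmax(x):
    """Scale each column to the 0-1 range."""
    x = np.asarray(x, dtype=float)
    span = x.max(axis=0) - x.min(axis=0)
    return (x - x.min(axis=0)) / np.where(span == 0, 1.0, span)


def pca_kaiser(fimax_table):
    """Unrotated PCA on the correlation matrix, keeping eigenvalues > 1.

    Returns the explained-variance fractions of all factors (descending)
    and the loadings of the retained factors.
    """
    corr = np.corrcoef(minmax(fimax_table), rowvar=False)
    eigval, eigvec = np.linalg.eigh(corr)
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]
    explained = eigval / eigval.sum()
    keep = eigval > 1.0                      # Kaiser criterion
    loadings = eigvec[:, keep] * np.sqrt(eigval[keep])
    return explained, loadings


# Synthetic stand-in data (NOT the study's measurements): three correlated
# columns playing the role of C1-C3 and one independent column for C4.
rng = np.random.default_rng(0)
base = rng.gamma(2.0, 5.0, 50)
demo = np.column_stack([
    base + rng.normal(0, 1, 50),
    1.2 * base + rng.normal(0, 1, 50),
    0.8 * base + rng.normal(0, 2, 50),
    rng.gamma(2.0, 4.0, 50),
])
explained, loadings = pca_kaiser(demo)
```

Because a correlation matrix is used, the min-max scaling does not change the eigenvalues; it is kept only to mirror the normalization described in the methods.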

Fig. 5

Component plots in rotated spaces (a, c) and property–property plots between the loadings of different PCA factors (b, d). Two subplots a and b were analyzed based on four PARAFAC components, while only C1, C2, and C3 were used in subplots c and d

In Fig. 5b, many river water samples (such as NTR′, NTR, and ETR) have relatively higher factor B scores (approximately 1 to 2.3) than CW, RW, SP, and the other river samples (<1, negative). For factor A, both CW and RW cluster around 1 (negative), whereas SP is widely dispersed (0.5 to 3.2). Furthermore, Fig. 5d reveals that most river samples (except for the secondary tributaries) cluster at relatively low scores of factor 2 and factor 1 (approximately −1.0–0.5 and −1.5–0.5). However, the SP and secondary tributary samples are highly dispersed. Additionally, RW and CW each form their own cluster (Fig. 5d); both have relatively high scores of factor 1 (>1, negative) but relatively low scores of factor 2 (<0.5, negative for CW and positive for RW).

Discussion

Preliminary interpretations of fluorescence spectra from different samples

As an indicator of water quality, FDOM can provide meaningful information on the organic matter in water samples. With the traditional peak picking method, the order of the FImax values of peaks C, A, and T for six typical water samples confirmed its feasibility (Fig. 2). However, as with the FRI method, fluorescent materials with multiple excitation peaks were artificially separated by the traditional peak picking method (Fig. 2) (Li et al. 2013b). Moreover, the fluorescence peaks could be seriously affected by the region boundaries (Korak et al. 2014) or overlapped by neighboring peaks such as peaks B and T (Fig. S6 in Supplementary Materials). Therefore, the EEM results (Fig. 2) could only support the preliminary conclusion that the clean mountainous streams were gradually contaminated along the flow directions.

In contrast, the underlying fluorescent species, comprising four components in our study, were successfully separated using the EEM-PARAFAC approach. Generally, a component with a peak at λ em > 380 nm is classified as a humic-like substance, whereas one with a peak at λ em < 380 nm is identified as a protein-like component (Ishii and Boyer 2012). The fluorescent peaks of C3 and C4 (not including the secondary abnormal peak) were incomplete. This might be attributed to the limited excitation wavelength range (Yang et al. 2015; Zhang et al. 2014) or the lack of fluorophores with multiexcitation properties (Li et al. 2013b). In addition, the slight blue shift of the emission wavelength of C2 may be attributed to the electrostatic attraction effect caused by protein-like substances that exhibit lower emission peaks (Li et al. 2013b).

Although the municipal sewage sample (collected from the outlet of the secondary sedimentation tank), with its distinct EEM (Fig. S6 in Supplementary Materials), was excluded from the PARAFAC model as an outlier, it exhibited high protein-like (tyrosine-like or combined tyrosine- and tryptophan-like) fluorescence similar to that of other anthropogenic pollutants such as rural domestic sewage (Goldman et al. 2012; Meng et al. 2013). The downstream river samples in our study also showed remarkable and homologous protein-like fluorescence signatures (Fig. 2e–f). Similarly, the effluent from the farm product processing facility also had high protein-like fluorescence (Fig. S6 in Supplementary Materials). Thus, these two kinds of effluents would be major contributors to FDOM (especially peaks T and B) in the receiving rivers if no advanced treatment such as a constructed wetland (Fig. S7 in Supplementary Materials) were applied.

Spatial heterogeneity of PARAFAC components in multistage rivers and other water samples

In our study area, FDOM is transported successively from the secondary tributaries to the primary tributaries (NTR, MTR, and STR) and finally into the main stream of ETR. Interestingly, in the multistage rivers, the spatial variability of the first three PARAFAC components (C1, C2, and C3) differed entirely from that of C4 (Fig. 4). The relatively low content of C4 in SP (Fig. 4) might be associated with humification and the instability of the tyrosine-like component. For the rainwater, the contribution of C4 to FDOM (8.0 ± 4.1 %) was lower than that of C2 (a similar tyrosine-like component, 29.1 ± 8.8 %) reported by Zhang et al. (2014), indicating that the composition and abundance of the tyrosine-like component differed among rainwater samples in the Lake Taihu watershed.

However, there were obvious spatial variations of the PARAFAC components C1, C2, and C3 along their migration pathway (Fig. 4). The FImax of C1, C2, and C3 was highest in SP and decreased through the secondary and then the primary tributaries. Thus, the quantity of the first three PARAFAC components in the multistage rivers declined during river transport. Although the detailed biogeochemical processes of FDOM in multistage rivers need further study, several possible explanations for the results of C1, C2, and C3 in Fig. 4 are proposed here for reference only.

The decrease might be mainly attributed to self-purification within the aquatic ecosystems, including dilution (Chen et al. 2013), photochemical transformation and photodegradation (Meng et al. 2013; Rodriguez-Zuniga et al. 2008), and microbial decomposition (Stutter et al. 2013). Firstly, compared with SP, dilution by the clean upstream water can greatly reduce the FDOM fluorescence in the secondary or primary tributaries (Fig. 1). Secondly, the decrease of C1, C2, and C3 in the tributaries suggested that the fluorescent organic matter might be decomposed gradually along the multistage rivers, especially in the secondary tributaries. For instance, the humic-like component C2 could be easily photobleached (Zhang et al. 2013), and the tryptophan-like material C3 could undergo indirect photochemical reactions promoted by humic-like substances (Xu and Jiang 2013). Thus, the bioavailability, microbial respiration, and degradation rate of DOM could be greatly strengthened (Chen and Jaffé 2014; Sulzberger and Durisch-Kaiser 2009). Moreover, Stutter et al. (2013) argued that 5–30 % of the natural DOM exported from headwaters could be utilized and decomposed by microbial metabolism. Therefore, photobleaching and biological degradation should greatly influence the fate of FDOM in multistage rivers. Thirdly, both C1 and C2 showed a consistent accumulation trend in ETR (Fig. 4). These recalcitrant, terrigenous components might come mainly from the humified FDOM in the primary tributaries and partly from riverine soil leachate and other diffuse sources. This phenomenon, together with the similar result of Chen et al. (2013), implied that the content of C1 and C2 in ETR might decrease only to a certain lower limit.

Notably, C4 was not detected in most of the CW samples (Fig. S4 in Supplementary Materials). However, the first three PARAFAC components were still present in CW in minute quantities, even in the samples exposed to the weakest anthropogenic influences (Fig. S4 in Supplementary Materials). These findings agreed closely with the results of Chen et al. (2013), and obvious protein-like material (peak T) has also been found in a clean reservoir water sample (Wei et al. 2012). The existence of such low amounts of FDOM in CW could be attributed to inputs of terrestrial organic matter and to autochthonous metabolites produced by algae, bacteria, and phytoplankton (Cammack et al. 2004). In fact, unlike C4 (Fig. 4), the first three components had nearly the lowest content in the CW samples among all sample types. Hence, the magnitude of the increase in FDOM content (C1, C2, and C3) relative to the CW background level could act as a promising estimate of water quality for river pollution diagnosis (Coble 2007; Zhang et al. 2011). Thus, the results in Fig. 4 further demonstrated that the water quality of the mountainous rivers deteriorated along the flow directions.
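The idea of gauging pollution by the increase of FDOM over the CW background can be sketched as a simple ratio. The snippet below is only an illustration under stated assumptions: the background is taken as the mean FImax of the CW samples for each component, and the indicator is the sample FImax divided by that mean. The source does not define the indicator in this form. The function name `enrichment_over_background` and all values are hypothetical.

```python
import numpy as np


def enrichment_over_background(sample_fimax, background_fimax):
    """Ratio of a sample's FImax to the mean background (CW) FImax, per component."""
    background_mean = np.asarray(background_fimax, dtype=float).mean(axis=0)
    if np.any(background_mean <= 0):
        # A zero background (component not detected) makes the ratio undefined.
        raise ValueError("background FImax must be positive for every component")
    return np.asarray(sample_fimax, dtype=float) / background_mean


# Hypothetical FImax values (columns: C1, C2, C3); not data from the study.
cw = np.array([
    [0.10, 0.08, 0.05],
    [0.12, 0.10, 0.05],
])
secondary_tributary = np.array([0.55, 0.45, 0.40])

ratios = enrichment_over_background(secondary_tributary, cw)
for name, ratio in zip(("C1", "C2", "C3"), ratios):
    print(f"{name}: {ratio:.1f} times the CW background")
```

The explicit check on the background reflects the observation above that C4 was not detected in most CW samples, so a ratio of this kind is only meaningful for components that are present in the reference water.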

Correlations, sources, and homogenization process of PARAFAC components

The sources and correlations of the four PARAFAC components could be revealed effectively by two-step PCA. The most probable mechanisms behind the PCA results (Fig. 5a, b) are listed as follows. Firstly, consistent with the completely different spatial distribution of C4 in multistage rivers (Fig. 4), Fig. 5a, b further illustrate that C4, the tyrosine-like component, should have entirely different sources from C1, C2, and C3. Moreover, the extremely poor results of the correlation (p > 0.1) and linear regression analyses (R² < 0.1) between C4 and the first three components again supported this speculation (Fig. S8 in Supplementary Materials). In fact, C4 might be mainly formed in situ through biological activity in the multistage rivers. Guo et al. (2014) and Zhang et al. (2011) also pointed out that this autochthonous tyrosine-like component probably represented semi-labile FDOM freshly produced by plants and phytoplankton. Consequently, many river water samples such as ETR, NTR′, and NTR possessed high scores of PCA factor B (Fig. 5b). Figure 4 also demonstrates that there is no significant difference in the content of C4 between the secondary tributaries and the subsequent primary tributaries. Furthermore, in the downstream reaches of the southwest mountainous rivers around Lake Taihu, including ETR (Fig. 1), Yao et al. (2011) claimed that the two tryptophan-like (similar to C3) and two humic-like materials (similar to C1 and C2) all possessed obviously lower FImax values than the tyrosine-like component (similar to C4). Additionally, the contributions of C4 from SP were extremely low, as were those from RW and CW (p > 0.05, Fig. S4 in Supplementary Materials). This result suggested that the extraneous input of C4 by SP or RW was limited and that the biological activity in CW was weaker than that in the multistage rivers.

Secondly, C1, C2, and C3 in multistage rivers might be mainly derived from allochthonous sources (Fig. 5a). As a result, the spatial variations of C1, C2, and C3 along their migration pathway were extremely similar (Fig. 4). Moreover, these three components were strongly correlated (p < 0.001, Fig. 6). Although C3 could be generated in situ during phytoplankton growth, this tryptophan-like component in rivers should mainly come from terrestrial domestic sewage and rainfall in our study (Mladenov et al. 2011; Zhang et al. 2014). Yao et al. (2011) also claimed that the contribution of the tryptophan-like component from autochthonous sources in rivers was much lower than that in Lake Taihu because the biological production rate in the latter was greatly enhanced.

Fig. 6 Correlation and linear regression analyses between C1, C2, and C3. The legend of panel a also applies to panels b and c
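For readers who wish to reproduce this kind of pairwise analysis on their own data, a minimal sketch of a correlation and least-squares line fit between two components is given here. It is not the authors' procedure, which the source does not detail. The function name `linear_fit` and the FImax values are hypothetical, and the sketch assumes an ordinary least-squares line with an intercept, for which R² equals the squared Pearson coefficient.

```python
import numpy as np


def linear_fit(x, y):
    """Least-squares line y = slope * x + intercept, plus Pearson r and R^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)  # first-degree polynomial fit
    r = np.corrcoef(x, y)[0, 1]                 # Pearson correlation coefficient
    # For a straight-line fit with an intercept, R^2 is the square of r.
    return slope, intercept, r, r ** 2


# Hypothetical FImax values of two humic-like components in five samples.
c1 = np.array([0.20, 0.50, 0.90, 1.40, 2.00])
c2 = np.array([0.15, 0.42, 0.70, 1.15, 1.60])

slope, intercept, r, r2 = linear_fit(c1, c2)
print(f"C2 = {slope:.3f} * C1 + {intercept:.3f} (r = {r:.3f}, R^2 = {r2:.3f})")
```

The sketch reports only the strength of the linear relationship. Significance levels such as those quoted in the text would additionally require a test on r, for example with `scipy.stats.pearsonr`.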

Moreover, only C1 and C2 tended to cluster together in Fig. 5c (Guo et al. 2014), which indicated that they should have similar sources and removal processes (Yao et al. 2015). The two dominant humic-like components (C1 and C2; Fig. 3, Fig. S5 in Supplementary Materials) have been widely recorded in the previous literature (Ishii and Boyer 2012). They commonly coexist and covary in natural waters (Yao et al. 2015), and the linear fitting equation (Fig. 6) again indicated their strong covariation (Zhang et al. 2011). In addition, these typical humic-like components, C1 (peak A + C) and C2 (peak A + M), are largely affected by terrestrial input (Phong et al. 2014; Zhang et al. 2011) and often appear in forest streams and agricultural rivers (Ishii and Boyer 2012; Stedmon and Markager 2005). Here, the multistage rivers mostly originate from the Tianmushan Mountains (Fig. 1) and mainly suffer from diffuse agricultural pollutants and domestic sewage. Therefore, C1 and C2 should mainly represent the diffuse humic substances spreading throughout the whole catchment. Certainly, the relative contributions of their natural and anthropogenic sources require further study. In contrast, the poor results of the linear regression analyses between C3 and C1 or C2 (Fig. 6) further implied that C3 should have different sources from C1 and C2. As stated in the preceding paragraph, C3 should derive mostly from anthropogenic inputs such as rural domestic wastewater and partly from autochthonous metabolites due to biological activities (Coble 2007; Hudson et al. 2007).

Furthermore, for the allochthonous FDOM (C1, C2, and C3), Figs. 5d and 6 effectively reveal significant differences in the quantities of the first three components among the diverse samples. The quality of C1, C2, and C3 in the various samples (expressed as their relative abundance in the total) is shown in Fig. S5 (Supplementary Materials). Generally, neighboring points resemble each other because of their similar fluorescence spectral patterns, concentrations, and composition. As shown in Figs. 5d and 6 and Fig. S5 (Supplementary Materials), the rivers’ pollution processes were clearly revealed. Many river samples, including CW, were closely clustered together and held weak negative scores on the two PCA factors. Thus, the concentrations of C1, C2, and C3 in rivers were inferred to be very low (Fig. 5d), but the humic-like components (C1 and C2) made the dominant contributions to FDOM (Fig. S5 in Supplementary Materials). In contrast, the pollution sources, which had the highest degree of heterogeneity, were more widely scattered than the other samples (Fig. 5d). In addition, the relative abundance of C3 among the first three components in RW was as high as 40–45 % (Fig. S5 in Supplementary Materials). In fact, rainwater always had a distinctive tryptophan-like peak in EEMs (Mladenov et al. 2011; Zhang et al. 2014), which offered a reasonable explanation for the positive scores of RW samples on PCA factor 2 (<0.5) (Fig. 5d).
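How sample scores on two PCA factors arise from the component intensities can be illustrated with a generic sketch. It is a plain single-step PCA on standardized FImax values, computed by singular value decomposition, and is not the two-step PCA used in this study, whose settings are not given here. The function name `pca_scores` and the data are hypothetical.

```python
import numpy as np


def pca_scores(data, n_factors=2):
    """Standardize each column, then return PCA scores and explained-variance ratios."""
    data = np.asarray(data, dtype=float)
    # z-score each component so that no single component dominates the analysis.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_factors] * s[:n_factors]  # sample coordinates on the factors
    explained = s ** 2 / np.sum(s ** 2)        # fraction of variance per factor
    return scores, explained[:n_factors]


# Hypothetical FImax values (columns: C1, C2, C3); not data from the study.
fimax = np.array([
    [0.10, 0.08, 0.05],  # clean water
    [0.12, 0.10, 0.06],  # clean water
    [0.30, 0.25, 0.12],  # primary tributary
    [0.55, 0.45, 0.40],  # secondary tributary
    [1.80, 1.20, 1.50],  # pollution source
])

scores, explained = pca_scores(fimax)
print("Scores on the first two factors:")
print(np.round(scores, 2))
print("Explained variance ratios:", np.round(explained, 3))
```

In such a score plot, samples with similar component intensities and proportions fall close together, which is the sense in which neighboring points are said to resemble each other. The sign of each factor is arbitrary, so only relative positions should be interpreted.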

Above all, the secondary tributaries, which are polluted first, seemingly converge gradually in Fig. 5d and move toward the next receiving rivers. The change in the relative abundance of C1, C2, and C3 reflected the same tendency (Fig. S5 in Supplementary Materials). Accordingly, ETR and its primary tributaries clustered closely together, and the contributions of C3 (15–25 %) to the allochthonous FDOM in these rivers also decreased to relatively stable levels (Fig. 5d and Fig. S5 in Supplementary Materials). Therefore, both the humification and decomposition processes promoted the formation of a stable mixture of FDOM, and the secondary tributaries played a key role in this homogenization process. Similar to Fig. 5d, Yao et al. (2011) reported that the water samples from Lake Taihu were more tightly clustered than those from all the accessory rivers including ETR. Therefore, it could be speculated that the allochthonous FDOM from diverse sources in our study area was gradually homogenized along the migration directions. Moreover, this process would presumably also occur in the other accessory rivers, which finally mix together around Lake Taihu.

Implications of the secondary tributary remediation for watershed management

In fact, the secondary tributaries serve as important channels, metaphorically the “blood capillaries” of the catchment, that collect diverse pollutants such as organic matter at the catchment scale. That is, the combined organic contaminants with higher FDOM concentrations (C1, C2, and C3 in Fig. 4) would pollute the secondary tributaries directly. Consequently, the FDOM content (represented by C1, C2, and C3) in the secondary tributaries, second only to that of the source pollutants, was significantly higher than that in the subsequent receiving rivers (Fig. 4). Furthermore, compared with the mainstream of ETR, the ecological deterioration of the secondary tributaries would be aggravated more easily because of their lower environmental capacity and weaker self-purification ability for FDOM. In addition, adverse hydrological conditions such as lower flow velocity and larger backwater areas in eutrophic tributaries further accelerated this process. The rainwater (Fig. 4) and the subsequent initial surface runoff also always contained high FDOM, especially in the Lake Taihu basin (Fig. 1), owing to air pollution (Carstea et al. 2010; Zhang et al. 2014). Therefore, these unfavorable factors could directly threaten the health of the original ecological systems in the secondary tributaries.

To improve the water quality of river networks, government funding in our study watershed (Fig. 1) would continue to increase. As a promising estimate of water quality, the spatial distribution of FDOM in multistage rivers could provide novel insights into watershed protection (Guo et al. 2014). In fact, effective control measures for combined pollution sources and ecological remediation of the secondary tributaries might be the most technically feasible and economical strategies for our goals. Thus, to relieve the organic matter loading in rivers and lakes such as those of the Lake Taihu watershed, the remediation of secondary tributaries cannot be ignored in watershed management and should receive more attention from local governments in future work. This suggestion could provide decision-makers with a realistic ecological restoration strategy for generating clean freshwater runoff in rivers.

Conclusions

This work focused on the utility of EEM-PARAFAC for tracking the fate of FDOM in multistage rivers in a typical subcatchment subject to combined pollution. Two dominant terrestrial humic-like (C1 and C2) and two protein-like (C3 and C4) components were identified. The average FImax of the first three components reached a maximum in the pollution sources and then decreased successively in the secondary and the primary tributaries. However, unlike C3 and C4, C1 and C2 accumulated gradually in the mainstream of ETR. Furthermore, PCA effectively identified the origins and differences of FDOM in multistage rivers, rainwater, and source pollutants. In addition, the migration of allochthonous FDOM gradually advanced its homogenization along the flow directions. Thus, FDOM could act as an effective tracer for exploring the organic matter characteristics and organic pollution levels of diverse water samples.

Admittedly, FDOM does not include non-fluorescent organic matter. Thus, a complete interpretation of DOM quality and quantity in water bodies needs further study. Nevertheless, the high enrichment of allochthonous FDOM (C1, C2, and C3) in the secondary tributaries provided a new insight. As shown in Fig. S9 (Graphical Abstract in Supplementary Materials), although further empirical substantiation is required, the supervision and management of the secondary tributaries would play a prominent role in future watershed remediation work.