Introduction

Chromophoric dissolved organic matter (CDOM) is the light-absorbing fraction of DOM found in all natural water bodies. It is a heterogeneous mixture of aliphatic and aromatic polymers, comprising terrestrially derived humic and fulvic acids and aquatic humic-like substances of phytoplankton origin (Kirk, 1994; Mostofa et al., 2013; Zhang et al., 2007). Humic-like substances make up approximately 40–60 % of dissolved organic carbon (DOC) in natural waters (Senesi, 1993); therefore, CDOM, which is sometimes reported as DOC (Morel and Gentili, 1993; Song et al., 2015), is dominated by humic-like components. CDOM modifies the optical properties of waters and affects the balance and availability of dissolved nutrients and metals in water bodies (Findlay and Sinsabaugh, 2003; Mostofa et al., 2013; Song et al., 2013); thus, it can be used as a water quality proxy. CDOM plays an array of important roles in aquatic systems, but high concentrations of organic matter are considered pollutants (Siepak, 1999). Although the spatiotemporal variations in CDOM in coastal and open oceans as well as inland aquatic ecosystems have been widely investigated (Shank et al., 2005; Song et al., 2015; Zhang et al., 2010), variations in CDOM in urban waters, which are controlled by soil type, land use, precipitation, and sewage discharge, have not been fully examined (Vasel and Praet, 2002; Westerhoff and Anning, 2000; Wu et al., 2006, 2011).

Urban water bodies, including lakes in parks, rivers flowing through cities, and reservoirs, benefit the people living in urban communities. These water bodies beautify the environment and clean the air, and public spaces along rivers and lakes offer opportunities for community gatherings and recreational activities. Water entering urban water bodies derives mainly from underground water, tap water, rainfall, and sewage effluents (Carstea et al., 2009; Wu et al., 2006). As a result of rapid urbanization and industrial development, large quantities of pollutants from various sources, such as residential/commercial wastewater, industrial discharge, automobiles, trash, and polluted storm water runoff from urban landscapes, are discharged into urban water bodies. Consequently, urban surface water quality has deteriorated severely, resulting in variations in CDOM concentration and composition.

Traditional methods, e.g., chemical oxygen demand (COD), biological oxygen demand (BOD), and total organic carbon (TOC), are used to evaluate organic pollutants in natural water and wastewater, but laboratory analyses are time consuming and labor intensive. COD and BOD may not accurately detect the concentrations of organic pollutants in water, whereas TOC does not provide detailed source information for compositionally complex CDOM pollutants (Teymouri, 2007; Guo et al., 2010). Ahmad and Reynolds (1995) proposed that fluorescence spectroscopy could be used to determine wastewater quality, and this method has been used as an effective tool for controlling sewage treatment and monitoring natural water quality (Ahmad and Reynolds, 1999; Baker, 2001; Hudson et al., 2007). Because some optically active components in CDOM re-emit fluorescence after being excited at certain wavelengths (Zhang et al., 2010), fluorescence spectroscopy serves as a surrogate to trace the relative contributions of different fluorescent fractions. To date, excitation-emission matrix spectroscopy (EEM) has proved a simple but effective technique for identifying CDOM components because of its high selectivity and sensitivity for CDOM in waters (Zhang et al., 2010). The fluorescence fingerprint has also been applied to retrieve pollution information on CDOM sources in urban water bodies (Wu et al., 2006).

The EEM approach has been used to distinguish the source of DOM in natural waters from that in rivers impacted by anthropogenic activities (Baker, 2001; Baker et al., 2003). According to the traditional “peak-picking” method, CDOM is composed of two key fluorescent fractions, i.e., humic- and protein-like substances (Coble, 1996; Hudson et al., 2007; Stedmon et al., 2003). Humic acids originate from terrestrial sources and phytoplankton. The protein-like components (i.e., tyrosine-like (B) and tryptophan-like (T) substances) consist of dissolved amino acids that originate from microbial reworking of humic-like substances and from the degradation products of once-living aquatic organisms (Mostofa et al., 2013; Teymouri, 2007; Thomas, 1997). Anthropogenic contamination may result in a high relative contribution of tryptophan-like fractions (Baker, 2001; Bridgeman et al., 2013). However, the traditional peak-picking method has been reported to be unreliable for evaluating CDOM dynamics in aquatic ecosystems because CDOM fluorophores overlap in the EEM (Coble, 1996; Stedmon et al., 2003). EEM coupled with parallel factor analysis (PARAFAC) has been shown to be an effective technique for decomposing excitation-emission matrix spectra of CDOM into independent fluorescent components (Borisover et al., 2009; Guo et al., 2010; Hua et al., 2007; Wang et al., 2014; Zhang et al., 2010, 2011, 2013). Stedmon et al. (2003) introduced PARAFAC to decompose excitation-emission matrix spectra into five distinct DOM components in a Danish estuary and its catchment. Hua et al. (2007) reported four components derived by PARAFAC modeling from water samples collected from landfills, wastewater treatment plants, lakes, and rivers. Guo et al. (2010) found five components in CDOM samples collected at different stages of sewage treatment using the combined EEM-PARAFAC technique.

In this study, EEM-PARAFAC was applied to analyze CDOM components in polluted urban waters sampled from cities in Northern and Northeastern China. Five field campaigns were conducted in cities between 29 July and 26 August 2013 in Northeastern China (Shenyang, Changchun, and Harbin), and between 15 and 16 July 2014 in northern China (Beijing and Tianjin). The specific objectives of this study were to (1) characterize the CDOM components in urban waters using EEM and identify their origins using the EEM-PARAFAC method; (2) assess the dynamics of individual fluorescent CDOM components with respect to spatial variation and pollution levels; and (3) identify the relationships between PARAFAC components and DOC, total nitrogen (TN), ammonium-N (NH4-N), and chemical oxygen demand (CODMn), which are the major pollution proxies traditionally used to assess water quality.

Materials and methods

Study area

The study area consisted of three provincial capital cities (Shenyang, Changchun, and Harbin) in Northeastern China and two municipalities (Beijing and Tianjin) in Northern China (Fig. 1). Shenyang is a sub-provincial city and the capital of Liaoning Province, located on the north bank of the Hunhe River. It is an important industrial center and the transportation and commercial center of China’s northeastern region. Changchun is located in Jilin Province and is dominated by heavy industries. Many of its lakes are located in parks or gardens, and the Yitong River runs through the city. Harbin is the capital of Heilongjiang Province and is situated in a productive farming region. Harbin serves as a key political, economic, scientific, cultural, and communication hub in Northeastern China. The Songhua River flows through the center of the city. The urban areas cover approximately 399 km2 with 4.60 million residents, 222 km2 with 3.50 million residents, and 318 km2 with about 3.65 million residents in Shenyang, Changchun, and Harbin, respectively. The surface water area accounts for 1.50, 4.90, and 8.05 % in Shenyang, Changchun, and Harbin, respectively. Rainfall is concentrated from June to September, and annual precipitation is about 760, 567, and 569.1 mm in Shenyang, Changchun, and Harbin, respectively.

Fig. 1 Locations of the urban water sampling sites in five Northeastern and Northern China cities. a Shenyang (S1–16), b Changchun (C1–28), and c Harbin (H1–22) in Northeastern China and d Beijing (B1–6) and e Tianjin (T1–12) in Northern China

Beijing is the capital of China and one of the most populous cities in the world. The city is the political, cultural, economic, and educational center of China. Its urban water bodies are primarily lakes in parks across the city. Tianjin, one of China’s four municipalities, is 137 km from Beijing; it was one of the earliest coastal cities to open up and is Northern China’s shipping and industrial center. The Haihe River flows through the city and connects with the Yellow River and the Yangtze River through the Grand Canal. The two municipalities cover areas of approximately 770 km2 with 21.70 million residents (Beijing) and 605 km2 with 15.46 million residents (Tianjin). The surface water area accounts for 0.56 and 1.72 % in Beijing and Tianjin, respectively. Both cities are located in the North China Plain and have a mean annual precipitation of 600 mm, around 80 % of which falls in June–August.

Field campaigns

A total of 84 water samples were collected from lakes in parks or gardens, rivers, reservoirs, and industrial and domestic wastewater pools in the five cities. Water samples were collected from 29 July to 26 August 2013 in the cities of Northeastern China and on 12–13 July 2014 in Beijing and Tianjin from the surface of the waters into 1-L, acid-cleaned plastic bottles and held on ice (Fig. 1). The samples were transported to the laboratory within 10 h, filtered, and stored at 4 °C until analysis within 2 days. Latitude and longitude were recorded in situ using a GPS receiver.

Analysis of water quality parameters

TN was measured based on absorption of water samples decomposed with alkaline potassium peroxydisulfate at a wavelength of 146 nm. Total phosphorus (TP) was determined using the molybdenum blue method after the samples were digested with potassium peroxydisulfate (APHA/AWWA/WEF, 1998). Chlorophyll a (Chl-a) was extracted from 0.45-μm pre-filtered samples with a 90 % acetone solution and determined using a Shimadzu UV-2600PC UV-Vis dual beam spectrophotometer. COD and NH4-N were measured by titration with acidic potassium permanganate, based on the procedures for Monitoring and Analytical Method of Water and Waste Water (State Environment Protection Administration of China, 2002). The DOC concentration was determined in 0.45-μm pre-filtered samples using a Shimadzu TOC-5000 Analyzer and a 1.2 % Pt silica catalyst at 680 °C. Potassium hydrogen phthalate was used as a standard. Reproducibility of the analytical procedure was 2–3 % for the current study.

Absorption measurements

All samples were filtered at low pressure through a pre-combusted Whatman GF/F filter (0.7 μm) and then through a pre-rinsed 25-mm Millipore cellulose filter (0.22 μm). All filtrates were kept in glass bottles. Absorption spectra of the samples were measured between 200 and 800 nm at 1-nm increments using a Shimadzu UV-2600PC UV-Vis dual beam spectrophotometer with a 1 cm (or 5 cm) quartz cuvette and Milli-Q water as a reference. The CDOM absorption coefficient was calculated from the sample optical density (OD) value measured using Eq. (1):

$$ {a}_{\mathrm{CDOM}}\left(\lambda \right)=2.303\;\left[{\mathrm{OD}}_{S\left(\lambda \right)}-{\mathrm{OD}}_{\left(\mathrm{null}\right)}\right]/\gamma $$
(1)

where γ is the cuvette path length (0.01 m), and the factor 2.303 converts base-10 logarithms to natural logarithms. Because some fine particles may have remained in the filtered solution (Babin et al., 2003; Bricaud et al., 1995), we corrected for fine-particle scattering by subtracting OD(null), the mean OD at 740–750 nm, where the absorbance of CDOM is assumed to be zero.
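As a minimal sketch of Eq. (1), the following Python function converts an optical density spectrum into absorption coefficients. The function name, the dictionary representation of the spectrum, and the example readings are illustrative assumptions, not part of the original workflow.

```python
def cdom_absorption(od_by_wavelength, path_length_m=0.01, null_band=(740, 750)):
    """Apply Eq. (1): a_CDOM(lambda) = 2.303 * [OD(lambda) - OD(null)] / gamma.

    od_by_wavelength: dict mapping wavelength (nm) to optical density.
    path_length_m: cuvette path length gamma in metres.
    null_band: wavelength range (nm) whose mean OD is used as OD(null).
    """
    lo, hi = null_band
    null_values = [od for wl, od in od_by_wavelength.items() if lo <= wl <= hi]
    if not null_values:
        raise ValueError("spectrum has no readings in the null band")
    od_null = sum(null_values) / len(null_values)
    return {wl: 2.303 * (od - od_null) / path_length_m
            for wl, od in od_by_wavelength.items()}

# Hypothetical readings: OD 0.030 at 355 nm and a flat 0.002 offset at 740-750 nm.
spectrum = {355: 0.030}
spectrum.update({wl: 0.002 for wl in range(740, 751)})
a = cdom_absorption(spectrum)
a355 = a[355]  # absorption coefficient at 355 nm, in m^-1
```

With these hypothetical readings the baseline-corrected OD is 0.028, which gives an a(355) of about 6.45 m−1 for a 0.01-m path length.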

The CDOM absorption spectrum, aCDOM(λ), was expressed as an exponential function (Babin et al., 2003; Bricaud et al., 1995):

$$ {a}_{\mathrm{CDOM}}\left({\lambda}_i\right)={a}_{\mathrm{CDOM}}\left({\lambda}_r\right) \exp \left[-S\left({\lambda}_i-{\lambda}_r\right)\right] $$
(2)

where aCDOM(λi) is the CDOM absorption at a given wavelength λi, aCDOM(λr) is the absorption estimate at the reference wavelength λr (440 nm), and S is the spectral slope of CDOM absorption. The slope ratio SR = S1/S2 is the ratio of the spectral slopes over the 275–295 nm (S1) and 350–400 nm (S2) wavelength ranges (Helms et al., 2008; Zhang et al., 2010).
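The slope and slope ratio can be illustrated with a short Python sketch. The text does not state how S was fitted, so the log-linear least-squares fit used here is an assumption, as are the function names and the synthetic spectrum.

```python
import math

def spectral_slope(wavelengths_nm, absorption, band):
    """Estimate S (nm^-1) of Eq. (2) over `band` by least squares on ln(a) versus wavelength."""
    lo, hi = band
    pts = [(wl, math.log(a)) for wl, a in zip(wavelengths_nm, absorption)
           if lo <= wl <= hi and a > 0]
    if len(pts) < 2:
        raise ValueError("need at least two positive readings in the band")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    return -sxy / sxx  # ln(a) decreases linearly with wavelength at rate S

def slope_ratio(wavelengths_nm, absorption):
    """Slope ratio: S over 275-295 nm divided by S over 350-400 nm."""
    s1 = spectral_slope(wavelengths_nm, absorption, (275, 295))
    s2 = spectral_slope(wavelengths_nm, absorption, (350, 400))
    return s1 / s2

# Synthetic spectrum that follows Eq. (2) exactly with one slope of 0.015 nm^-1.
wl = list(range(250, 451))
a_syn = [2.0 * math.exp(-0.015 * (w - 440)) for w in wl]
s = spectral_slope(wl, a_syn, (275, 295))
sr = slope_ratio(wl, a_syn)
```

A spectrum with a single exponential slope returns a ratio of 1; measured spectra deviate from 1 because the slope differs between the two wavelength ranges.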

Three-dimensional fluorescence measurements

Excitation-emission matrix fluorescence spectra of the CDOM were measured using a Hitachi F-7000 fluorescence spectrometer (Hitachi High-Technologies, Tokyo, Japan) with a 700-voltage xenon lamp at scanning ranges of 200–450 nm for excitation and 250–500 nm for emission. Readings were collected in ratio mode at 5-nm intervals for excitation and at 1-nm intervals for emission at a scanning speed of 2400 nm min−1. The band-passes were 5 nm for both excitation and emission. A Milli-Q water blank of the excitation-emission matrix fluorescence spectra was subtracted to eliminate water Raman scatter peaks (McKnight et al., 2001; Stedmon et al., 2003; Zhang et al., 2010, 2011).

The excitation-emission matrix fluorescence spectra were corrected for absorbance to eliminate the inner-filter effect by multiplying each value by a correction factor, based on the assumption that the mean absorption path length of the excited and emitted light is one half of the cuvette length (McKnight et al., 2001; Zhang et al., 2010). Finally, the fluorescence intensities in all sample excitation-emission matrix fluorescence spectra were normalized to the area under the Milli-Q water Raman peak (λex = 350 nm and λem = 371–428 nm) measured daily (Lawaetz and Stedmon, 2009). The contour figures of the excitation-emission matrix fluorescence spectra were plotted using MATLAB 10.0 software (MathWorks, Natick, MA, USA).
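The two correction steps can be sketched in Python as follows. The exact correction formula is not given in the text; the factor 10^((Aex + Aem)/2) used here is the commonly applied form consistent with the half-path-length assumption, and it should be read as an assumption. The function names and the example numbers are hypothetical.

```python
def inner_filter_correct(f_obs, a_ex, a_em):
    """Correct one EEM intensity for absorbance at its excitation and emission wavelengths.

    a_ex, a_em: decadic absorbance in the fluorescence cuvette at the
    excitation and emission wavelengths. The factor 10**((a_ex + a_em) / 2)
    assumes light travels, on average, half the cuvette length.
    """
    return f_obs * 10 ** ((a_ex + a_em) / 2.0)

def raman_area(em_nm, blank_intensity, band=(371, 428)):
    """Trapezoidal area under the blank's Raman peak (emission scan at 350 nm excitation)."""
    pts = [(w, i) for w, i in zip(em_nm, blank_intensity) if band[0] <= w <= band[1]]
    return sum((w2 - w1) * (i1 + i2) / 2.0
               for (w1, i1), (w2, i2) in zip(pts, pts[1:]))

def to_raman_units(f_corrected, area):
    """Express a corrected intensity in Raman units (nm^-1)."""
    return f_corrected / area

# Hypothetical blank scan: constant intensity of 2.0 from 371 to 428 nm at 1-nm steps.
em = list(range(371, 429))
blank = [2.0] * len(em)
area = raman_area(em, blank)
f_corr = inner_filter_correct(100.0, 0.2, 0.0)
f_ru = to_raman_units(f_corr, area)
```

The trapezoidal rule is written out by hand so that the sketch needs no external library; any numerical integration over the 371–428 nm window would serve the same purpose.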

PARAFAC modeling

A three-way PARAFAC model was applied to decompose the excitation-emission matrix fluorescence spectra of the complex CDOM mixture into individual fluorescent groups. A detailed description of the PARAFAC methodology can be found in Stedmon and Bro (2008). A split half analysis was included in the PARAFAC modeling to validate the results (Stedmon and Bro, 2008). Fluorescence intensities of all components were represented by the Fmax value (Raman unit: nm−1) (Stedmon and Markager, 2005).
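For reference, the trilinear model underlying PARAFAC, as it is commonly written in the EEM literature (e.g., Stedmon and Bro, 2008), can be expressed as follows. The index and symbol names are notational choices made here for illustration.

$$ {x}_{ijk}=\sum_{f=1}^F{a}_{if}{b}_{jf}{c}_{kf}+{e}_{ijk} $$

Here x_ijk is the fluorescence intensity of sample i at emission wavelength j and excitation wavelength k, a_if is proportional to the concentration of component f in sample i, b_jf and c_kf are the emission and excitation loadings of component f, F is the number of components, and e_ijk is the residual not explained by the model.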

Statistical analyses

The statistical analyses were conducted using the SPSS 16.0 software package (SPSS Inc., Chicago, IL, USA). Principal components analysis (PCA) of the three PARAFAC components, the CDOM absorption coefficient a(355), the slope ratio SR, and DOC concentrations was conducted to identify the relative distribution of the 84 water samples. Cluster analysis (CA) based on the three PARAFAC components was used to group the 28 water samples from Changchun into three clusters of similar pollution levels. Regression and correlation analyses were used to quantify the relationships of CDOM absorption coefficients, DOC concentrations, TN, NH4-N, and CODMn with Fmax for the humic-like (C1) and tryptophan-like (C2) components. Differences in parameters were assessed with an independent-samples t test, and a p value ≤0.05 was considered significant.

Results and discussion

Water quality conditions

The TN, TP, and Chl-a concentrations for the 84 water samples are displayed in Table 1. Nitrogen and phosphorus concentrations are used to indicate the eutrophic status of urban waters. In general, the water samples had high TN (5.88 ± 7.53 mg L−1), TP (0.34 ± 0.50 mg L−1), and Chl-a concentrations (47.91 ± 61.94 mg L−1), indicating that urban waters in Northern and Northeastern China during the summers of 2013 and 2014 were severely polluted with nitrogen and phosphorus, resulting in algal blooms.

Table 1 Mean water quality parameter values for urban water bodies in Northern and Northeastern China

DOC concentrations

As shown in Fig. 2 and Table 2, DOC concentrations in the samples from the five Chinese cities varied from one water body to another as well as from one study area to another. The DOC concentrations ranged from 4.36 (in Tianjin) to 138.6 mg L−1 (in Harbin), with a mean value of 18.10 ± 19.18 mg L−1 when all the samples were pooled together (Fig. 2a). In the North China Plain, the soil type is dominated by loess, while in Northeastern China, the soil type is dominated by humic-rich black soil (Song et al., 2013). This would result in different background DOC concentrations, derived from soil humic substances, in the waters of Northern and Northeastern China. The urban water samples from the three cities in Northeastern China (i.e., Shenyang, Changchun, and Harbin) ranged from landscape waters to sewage, whereas the samples from the two municipalities in the North China Plain (i.e., Beijing and Tianjin) were collected only from landscape waters. Therefore, the water samples collected from these five cities were divided into three groups: group 1 includes the landscape waters from Northern China (i.e., Beijing and Tianjin); group 2 is made up of the landscape waters from Northeastern China; and group 3 consists of the polluted waters (i.e., sewage or effluent) from Northeastern China (i.e., Shenyang, Changchun, and Harbin).

Fig. 2 a Dissolved organic carbon (DOC) concentrations for water samples collected from different cities of Northern and Northeastern China (SY Shenyang, CC Changchun, and HB Harbin in Northeastern China and BJ Beijing and TJ Tianjin in Northern China). Box plots of b DOC, c a(355), and d SR. Group 1 the landscape waters from North China, Group 2 the landscape waters from Northeast China, Group 3 the polluted waters from Northeast China

Table 2 Mean dissolved organic carbon (DOC), chemical oxygen demand (CODMn), and ammonia-N (NH4-N) concentrations, and chromophoric dissolved organic matter (CDOM) absorption coefficient groups in different cities of Northern and Northeastern China

There was a significant difference in mean DOC concentrations between the landscape waters of Northern (group 1) and Northeastern (group 2) China (t test, p < 0.001) (Fig. 2b). The DOC concentrations in group 2 ranged from 6.16 (in Changchun) to 17.47 mg L−1 (in Harbin), with a mean of 13.08 ± 2.44 mg L−1 when the landscape water samples from Northeastern China were pooled together (Fig. 2a, b). Water samples in group 1 were collected only from landscape waters in Northern China. When the water samples from the two cities were pooled, DOC concentrations ranged from 4.36 mg L−1 in a sample from the Haihe in Tianjin to 10.71 mg L−1 in a sample from Beijing, with a mean value of 5.79 ± 1.83 mg L−1. These DOC concentrations were much lower than those of the landscape waters in Northeastern China (10–20 mg L−1), even though the water was severely polluted (Fig. 2a). This result may partially be explained by the difference in soil types, with loess dominating in the North China Plain and black soil in Northeastern China. The effects of soil type on variations in stream DOC concentrations have been reported (Aitkenhead et al., 1999; Dawson et al., 2001) and may set the background DOC concentration of waters in different regions.

There was also a significant difference in average DOC concentrations between the landscape waters (group 2) and polluted waters (group 3) in Northeastern China (t test, p < 0.001) (Fig. 2b). The DOC concentrations ranged from 6.16 (in Changchun) to 138.6 mg L−1 (in Harbin) in Northeastern China, with a mean of 21.46 ± 19.18 mg L−1 when all water samples collected from Northeastern China were pooled together. The DOC concentrations in landscape waters from this study region (group 2) were 10–20 mg L−1 (Fig. 2a, b). However, DOC concentrations in waters severely polluted by sewage from anthropogenic sources (group 3) were much higher (>20 mg L−1 at sampling sites H1–6 and C8–11), reaching about 140 mg L−1 at a waste discharge point in Harbin (H19).

CDOM characteristics

Generally, the absorption coefficient a(355) is used as a proxy for CDOM concentration (Zhang et al., 2007), and a(254) is used as an optical indicator of DOC aromaticity (Weishaar et al., 2003). As shown in Table 2, the absorption coefficients a(254) and a(355) ranged from 10.51 to 107.55 m−1 and from 0.92 to 22.24 m−1, with means of 31.36 ± 20.80 and 6.00 ± 4.34 m−1, respectively. The relative distribution of a(355) across groups 1–3 in Northern and Northeastern China was similar to that of DOC concentrations (Fig. 2c). The mean a(355) values for water samples in groups 1 (landscape waters in the North), 2 (landscape waters in the Northeast), and 3 (polluted waters in the Northeast) were 2.96 ± 1.16, 4.64 ± 2.08, and 11.87 ± 4.63 m−1, respectively (Fig. 2c), and these values were significantly different from each other (t test, p < 0.001). The spectral slope ratio (SR) of the two wavelength ranges (275–295:350–400 nm) was used to represent DOM molecular weight (Helms et al., 2008). The mean SR decreased from 1.40 ± 0.11 to 1.17 ± 0.66, and further to 0.90 ± 0.14 for groups 1–3, respectively. Among the three groups, the highest average SR (∼1.40) was found in landscape waters from Northern China, indicating a lower average molecular weight of DOC in these water samples; the lowest average SR (∼1.00) suggested a higher average molecular weight of DOC in the polluted water samples from Northeastern China (Fig. 2d). The difference in SR likely reflects regional variation, from Northern to Northeastern China, in microbial activity associated with plant decay and sewage and in terrestrially derived substances imported from soil.

Characterization of the CDOM EEM fluorescence spectra

The humic-like peaks (A and C), tryptophan-like peak (T), and tyrosine-like peak (B) were detected in the 84 urban water samples (Coble, 1996; Stedmon et al., 2003). Fluorescence intensity depends on the concentrations of the main fluorophores dissolved in the waters. Five types of urban water bodies were distinguished based on the intensities and locations of the marked fluorescence peaks (Fig. 3 and Table 2). Type I water was mainly from underground and tap water. CDOM in type II water mainly originated from terrestrial sources or plant decay. Most of the Harbin water samples showed characteristic type II peaks. Type III water is a mixture of freshly supplied underground water and polluted water from industrial and household sources. Type III water was detected in rivers flowing through cities, such as the Yitong River in Changchun and the Hunhe River in Shenyang. The tryptophan-like peak (T) was relatively high and predominant in type IV water samples. These samples came from lakes in parks, e.g., Nanhu and Beihu Lakes in Changchun, and displayed fluorescence peaks characteristic of treated polluted or untreated wastewater effluents from the city. Type V water typically contains substances from sewage water. The two humic-like peaks and the tyrosine-like peak were obscured by the distinctively high tryptophan-like (T) fluorescence intensity in type V water. The distinctly high protein-like intensity in wastewater and sewage is explained by the difference in CDOM quality between natural and sewage water, which is consistent with the findings reported by Ma et al. (2001) and Baker (2001).

Fig. 3 Five fluorescence spectrum types in urban waters sampled from cities of Northern and Northeastern China (Raman: nm−1). a Type I, mainly from underground and tap waters; b Type II, derived from terrestrial sources or plant decay; c Type III, freshly supplied underground water and polluted sewage water from industrial and household sources; d Type IV, lakes in parks containing treated polluted or untreated wastewater effluent; and e Type V, typically derived from substances in sewage water

To determine the appropriate number of PARAFAC components, the split half validation procedure was used to verify the model by comparing the emission and excitation loadings from each half (Stedmon and Bro, 2008). In the split half analysis, the 84 EEMs were randomly divided into four groups and then analyzed for two different splits (1–2 and 3–4 half splits). A three-component model was well validated by this procedure and was found to adequately describe the whole EEM dataset (Fig. 4).
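The comparison of loadings between halves can be sketched as below. The text does not name the similarity measure used; Tucker's congruence coefficient is one commonly used choice and is an assumption here, as are the 0.95 threshold, the function names, and the toy loadings.

```python
import math

def tucker_congruence(x, y):
    """Tucker's congruence coefficient between two loading vectors (1.0 means identical shape)."""
    if len(x) != len(y):
        raise ValueError("loadings must be sampled at the same wavelengths")
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den

def halves_match(loadings_a, loadings_b, threshold=0.95):
    """True if every component in half A has a counterpart in half B at or above `threshold`."""
    return all(max(tucker_congruence(a, b) for b in loadings_b) >= threshold
               for a in loadings_a)

# Toy loadings: half B holds the same two spectral shapes, rescaled and in reverse order.
half_a = [[0.1, 0.5, 0.9, 0.4], [0.8, 0.3, 0.1, 0.0]]
half_b = [[1.6, 0.6, 0.2, 0.0], [0.05, 0.25, 0.45, 0.2]]
validated = halves_match(half_a, half_b)
same_shape = tucker_congruence(half_a[0], half_b[1])
different_shape = tucker_congruence(half_a[0], half_a[1])
```

Because the coefficient ignores scale, it compares only spectral shape, which is what matters when the two halves contain different samples. In practice the check would be run on both the excitation and the emission loadings.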

Fig. 4 Results from split half analysis (1–2 up; 3–4 down) in PARAFAC modeling. The plots represent spectral shapes of the excitation and emission loadings from the two halves (1–2; 3–4 split half analysis) modeling

The three individual fluorescent components (Fig. 5a–c) and their excitation and emission loadings (Fig. 5d–f) identified by EEM-PARAFAC are summarized in Fig. 5 and Table 3. The first fluorescent component (C1) was a humic-like substance with two excitation maxima (230 and 330 nm) at a single emission wavelength (442 nm), which was consistent with the humic-like peaks (A and C) defined by Coble (1996). Component 2 had two excitation maxima (220 and 285 nm) and one emission maximum (348 nm), which is similar to tryptophan-like substances from anthropogenic sewage CDOM inputs. The humic-like C1 and tryptophan-like C2 components are similar to components derived from PARAFAC modeling of different waters (Teymouri, 2007). Component 3 is likely a tyrosine-like substance, characterized by maximal excitation at 270 nm and emission at 300 nm. Components 2 and 3 represent autochthonous semi-labile CDOM associated with bacterial activities or phytoplankton decomposition (Stedmon et al., 2003).

Fig. 5 The parallel factor analysis (PARAFAC) modeling output shows the contour plots of the three PARAFAC fluorescent components (a–c) and excitation (black) and emission (red) loadings (d–f) of each component. (Raman: nm−1)

Table 3 Positions of the maximum fluorescence peaks of the three components identified by parallel factor analysis (PARAFAC) modeling in the present study compared with those identified previously. Secondary excitation bands are given in brackets

Variations in the PARAFAC components and other indices of CDOM

PCA was applied to the three fluorescent components (i.e., C1, C2, and C3), a(355), the slope ratio SR, and the DOC concentrations to assess their relative distributions at all sampling locations (Fig. 6). The first two PCA axes explained 75.1 % of the total variance in the data set (PC1 = 59.1 %; PC2 = 16.0 %). The PC1 axis showed strong positive loadings for DOC, a(355), the humic-like C1, and the tryptophan-like C2. The tyrosine-like C3 and the slope ratio SR showed high positive loadings on the PC2 axis. A notable difference was found between the urban water samples collected from the two cities in the North China Plain and those from the three cities in Northeastern China (Fig. 6). The water samples from the North China Plain were more tightly clustered, whereas those from Northeastern China were relatively scattered, particularly the samples from Changchun and Harbin. Across all five cities of Northern and Northeastern China, the landscape water samples clustered at lower PC1 and PC2 values than the polluted waters.

Fig. 6 Principal component analysis of PARAFAC components and other indices of CDOM from all water samples

Statistical variability in the CDOM components within a city

Assessing CDOM dynamics across different cities can result in dispersed data, so variations within a single study area must be analyzed statistically. DOC is an important water pollution indicator (Florescu et al., 2013; Yu et al., 2015; Siepak, 1999); thus, samples from across Changchun (Fig. 2a), which ranged from lightly to severely polluted waters, were analyzed by cluster analysis (CA). The CA was based on the three EEM-PARAFAC-extracted components for these samples and grouped the sites into three main clusters at a distance measure of 5.0 on the dendrogram (Fig. 7). Cluster A consisted of the bulk of the landscape water sites (CC17–26 and CC1–6), which contained underground water, tap water, and treated effluents. The lightly polluted sites (CC2, 6, and 7) were separated from the landscape water sites within sub-clusters of relatively clear waters. Cluster B represented severely polluted landscape waters (CC8–10 and CC11, 15, and 16). Cluster C was composed of industrial and household sewage waters (CC12 and 14). These CA results are reasonable given the relative pollution conditions at the sampling sites across Changchun and agree with the variations in DOC concentrations among clusters A–C.
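The grouping step can be illustrated with a small agglomerative clustering sketch in Python. The linkage rule and distance metric used in the study are not stated, so average linkage with Euclidean distance is an assumption here, and the site values are hypothetical. SPSS dendrograms typically report a rescaled distance, so the cut-off below is not comparable to the value of 5.0 quoted above.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def cluster_distance(c1, c2, points):
    """Average pairwise distance between members of two clusters (average linkage)."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(points, cut_distance):
    """Merge the two closest clusters repeatedly until none are within `cut_distance`."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        dist, ia, ib = min(((cluster_distance(a, b, points), ia, ib)
                            for ia, a in enumerate(clusters)
                            for ib, b in enumerate(clusters) if ia < ib),
                           key=lambda t: t[0])
        if dist > cut_distance:
            break
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]
    return [sorted(c) for c in clusters]

# Hypothetical (C1, C2, C3) intensities for five sites: two clean, two polluted, one sewage.
sites = [(0.5, 0.4, 0.1), (0.6, 0.5, 0.1), (2.0, 1.8, 0.3), (2.2, 2.0, 0.3), (6.0, 9.0, 1.0)]
groups = agglomerate(sites, cut_distance=1.0)
```

With these values the two clean sites and the two polluted sites merge into pairs, and the sewage-like site remains on its own, giving three clusters.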

Fig. 7 Dendrograms based on the three excitation-emission matrix fluorescence and parallel factor analyses (EEM-PARAFAC) components for all sampling sites across the Changchun city. CC Changchun in Northeastern China

Spatial variations and pollution levels versus extracted CDOM components

The spatial changes and physicochemical properties of urban waters have significant effects on CDOM dynamics (Jaffé et al., 2008). The DOC concentrations in urban waters within a study area are associated with relative pollution levels (Florescu et al., 2013; Siepak, 1999). Therefore, water samples spanning a wide DOC concentration gradient from Harbin, Changchun, and Tianjin were chosen to assess the CDOM dynamics of urban waters (Fig. 8). Water samples at three DOC concentration levels (i.e., type I, very low DOC concentration; type II, intermediate DOC concentration; and type III, very high DOC concentration) were selected from each city to represent typical lightly polluted waters, severely polluted waters, and sewage water. Only type I and II samples were selected from Tianjin because no sewage sample (type III) was available. The C1 and C2 fluorescence intensities of these urban water samples increased together as DOC concentrations increased from type I to type III within a given city (Fig. 8). These results indicate that the fluorescence intensities of the three CDOM components in water samples from the same city differed with pollution level. Fluorescence intensity was dominated by the C2 component when the water was severely polluted, as reported by Baker and Inverarity (2004), and the distinctively high tryptophan-like C2 signal can obscure the humic-like C1 and tyrosine-like C3 components. Notably, the fluorescence intensity of C1 (i.e., the humic-like component) in the selected samples increased gradually from Tianjin to Changchun and then to Harbin, regardless of whether the samples were type I or II. This can be explained by the gradual accumulation in water of CDOM from allochthonous sources, which originates from landscapes along the gradient of soil organic enrichment from Northern to Northeastern China.

Fig. 8 Fmax values of the three components (C1, C2, and C3) in Type I–III waters from Tianjin, Changchun, and Harbin, respectively. TJ Tianjin in Northern China, CC Changchun, and HB Harbin in Northeastern China. Numbers correspond to the sampling sites in Fig. 2a

CDOM components versus water quality parameters

Significant positive linear correlations were found between Fmax for the humic-like component (C1) and both a(254) (R2 = 0.89, p < 0.01) and a(355) (R2 = 0.94, p < 0.01) when the entire data set (n = 84) was pooled (Fig. 9a and Table 4). These results agree with those from previous studies (Holbrook et al., 2006; Yao et al., 2011; Zhang et al., 2010, 2011). The linkage between fluorescence signals and DOC is very complex because of the effects of the stable and labile fluorescent and non-fluorescent CDOM components, which are influenced by soil types, rainfall, sewage discharge, and algal blooms and which affect DOC concentrations (Hudson et al., 2007; Jaffé et al., 2008). In accordance with Yao et al. (2011), a positive linear correlation (R2 = 0.77) was found between DOC and C1 derived from terrestrial humic substances, but a relatively weak relationship (R2 = 0.56) was found between DOC and C2, which derives from microbial decay of plant material or from water polluted by human and animal waste (Fig. 9b, c). DOC in the urban water samples originated mainly from terrestrial sources, with a small portion from sewage, which explains the much higher DOC concentrations in landscape waters in Northeastern China (mean, 13.38 ± 2.82 mg L−1), attributable to the unique black soil, than in Northern China (5.79 ± 1.83 mg L−1). Our results show stronger correlations between the EEM-PARAFAC-extracted humic-like C1 component and the corresponding DOC concentrations than those reported by Holbrook et al. (2006).
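The R2 values quoted above come from simple linear regressions. A minimal sketch of such a fit is given below; the function name and the example data are hypothetical and are not values from this study.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept; returns (slope, intercept, R^2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared

# Hypothetical, perfectly linear data: Fmax (C1) against DOC in mg/L.
fmax_c1 = [0.5, 1.0, 1.5, 2.0]
doc = [6.0, 8.5, 11.0, 13.5]
slope, intercept, r2 = linear_fit(fmax_c1, doc)
```

For a single predictor, R2 equals the squared Pearson correlation coefficient, which is why the same quantity can be read either from the regression or from a correlation table.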

Fig. 9

Correlations between Fmax (C1) and (a) the absorption coefficient a(355) and (b) dissolved organic carbon (DOC), and between Fmax (C2) and (c) DOC, (d) total nitrogen (TN), (e) ammonia-N (NH4-N), and (f) chemical oxygen demand (CODMn)

Table 4 Correlation coefficients (R) and significance levels (p) for the linear relationships between the fluorescent components (C1, C2, and C3) and CDOM absorption, dissolved organic carbon (DOC), total nitrogen (TN), total phosphorus (TP), ammonia-N (NH4-N), and chemical oxygen demand (CODMn)

Water quality is typically characterized by parameters such as CODMn, NH4-N, TN, and DOC. NH4-N and CODMn are important urban water pollution indicators (Baker et al., 2003; Yin et al., 2011). In our study, the highest DOC, TN, CODMn, and NH4-N values were observed for the type V excitation-emission matrix fluorescence spectra (wastewater). The strong fluorescence intensities of the protein-like components derived from treated or untreated sewage effluents have led to the application of EEM as a useful tool for monitoring water quality (Henderson et al., 2009). Zhang et al. (2011) and Yao et al. (2011) reported strong positive linear relationships between DOC, CODMn concentration, and the humic-like components (C1 and C2). However, we found a positive correlation between Fmax (C2) and TN (R² = 0.78). The tryptophan-like C2 intensity was moderately correlated with NH4-N (R² = 0.68) and CODMn (R² = 0.52), excluding 22 water samples collected from Harbin for which data were unavailable (Fig. 9d–f and Table 4). These results differ from previous reports on the correlations between CODMn, NH4-N, and the tryptophan-like component (T) (Baker and Inverarity 2004; Vasel and Praet 2002), and other studies found almost no correlation between protein-like components and NH4-N or CODMn (Baker et al., 2003; Wu et al., 2006). The close relationships between Fmax (C2) and traditional water quality parameters indicate that the protein-like C2 component can be used for quick and easy evaluation of urban water pollution levels across China, as an alternative to traditional water testing approaches, which are time-consuming and labor-intensive.
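Because some water quality measurements were unavailable for part of the samples, a correlation of this kind can only be computed on samples for which both variables are present. The sketch below shows one plausible way to do so (pairwise deletion of incomplete samples); this handling is an assumption made for illustration, and the function name `r_squared_complete_pairs` and all values are hypothetical rather than taken from this study.

```python
def r_squared_complete_pairs(x, y):
    """Squared Pearson correlation using only samples where both values exist.

    Missing measurements are represented by None and dropped pairwise.
    Returns (r2, n_used).
    """
    pairs = [(xi, yi) for xi, yi in zip(x, y) if xi is not None and yi is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least 3 complete pairs")
    mean_x = sum(p[0] for p in pairs) / n
    mean_y = sum(p[1] for p in pairs) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in pairs)
    syy = sum((yi - mean_y) ** 2 for _, yi in pairs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary among the complete pairs")
    return sxy ** 2 / (sxx * syy), n


# Synthetic, illustrative values only (NOT measurements from this study)
fmax_c2 = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]      # hypothetical Fmax (C2) intensities
nh4_n = [0.4, None, 1.6, 2.1, None, 2.9]      # None marks an unavailable NH4-N value

r2, n_used = r_squared_complete_pairs(fmax_c2, nh4_n)
print(f"R2={r2:.3f} from {n_used} complete samples")
```

Reporting the number of samples actually used alongside R² makes it clear when a relationship rests on a subset of the full data set.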

Conclusions

In this study, EEM-PARAFAC was used to characterize three fluorescent components in CDOM from 84 urban water samples collected from July 2013 to July 2014 in cities of Northern and Northeastern China. One terrestrial humic-like component (C1), one tryptophan-like component (C2), and one tyrosine-like component (C3) were identified using PARAFAC. The mean fluorescence intensities of the three CDOM components differed from one body of water to another and from one study area to another. CA was used to group the samples from different waters within a study area according to similar pollution levels. Fluorescence intensity was characterized by the tryptophan-like component (C2) when the water was severely polluted. The CDOM concentration from allochthonous sources increased gradually along the transition in soil types from Northern to Northeastern China. Strong linear correlations were observed between Fmax (C1) and the CDOM absorption coefficients a(254) and a(355). A positive linear correlation was found between DOC and the humic-like component C1, but a relatively weak correlation was observed with the tryptophan-like component C2. The results indicated that DOC concentrations in urban waters were primarily controlled by the humic-like C1 component. A strong positive correlation was observed between the intensity of the tryptophan-like C2 component and TN, while moderate correlations were found between the intensity of the tryptophan-like C2 component and NH4-N and CODMn. The fluorescence intensities of the CDOM components reflected the impacts of anthropogenic activities and provided more rapid water quality information than traditional approaches. These results demonstrate that the EEM-PARAFAC technique is useful for evaluating the dynamics of CDOM fluorescent components in urban waters from Northern and Northeastern China.