Introduction

As an important part of the Earth's ecosystem, lakes provide essential ecosystem services such as resilience to climate change, productivity through biodiversity, hydrological regulation, fisheries, recreation, and tourism (Schallenberg et al. 2013; Zhang et al. 2019). Water quality can influence the health of lake wetland systems and their utility to human society as a source of water (Wang et al. 2019). Drivers of changes in lake water quality are complex (Han et al. 2020) because lakes are affected by various anthropogenic activities and natural factors, with the former being most prominent (Xian et al. 2007; Huang et al. 2010). The complexity of pollution sources and their interactions also poses significant challenges to managing surface water resources (Elhatip et al. 2007). Long-term water quality monitoring data and river basin pollutant discharge data can be a valuable resource for characterizing temporal variations in water quality and understanding its complex drivers (Wan et al. 2017; Han et al. 2020). This knowledge can inform the formulation of improved pollution control strategies and water resources management in lakes (Chang 2005; Liu et al. 2020).

Honghu Lake is the seventh-largest freshwater wetland system in China (Li et al. 2018). As in many large lakes in China, the water quality of this shallow lake deteriorated severely under the pressure of intense human activities before pollution control measures were implemented (Mo et al. 2009). The pollution of Honghu Lake is considered the result of the combined effects of large-scale enclosure aquaculture in the lake and the inflow of external pollutants (Gui and Yu 2008; Zhang et al. 2017). The Honghu Lake Wetland Protection and Restoration Demonstration Project has been implemented since 2004 and has included interventions such as the dismantling of aquaculture enclosures, the interception of industrial and domestic wastewater, and the reconstruction of the ecosystem (Chang et al. 2015). In addition, controls on livestock farming and fertilizer use were tightened in 2010 and 2013, respectively (Chen et al. 2020). The long implementation period of the restoration measures and the fact that multiple measures were implemented together make it difficult to assess the actual effect of any particular measure. While these measures have resulted in the effective control of some pollution sources, the concentrations of certain pollutants in Honghu Lake still far exceed desired levels (Li et al. 2017a). Previous studies have provided useful information for the environmental protection of Honghu Lake, but these mainly focused on heavy metals (Li et al. 2017b; Liu et al. 2018), spatial variations in water quality (Li et al. 2015), and ecological security (Mo et al. 2009). Few studies have focused on quantifying the impact of various pollution sources and pollution control measures on water quality or on exploring further measures to achieve the desired water quality levels.

Multivariate statistical techniques, such as cluster analysis (CA), correlation analysis, and principal component analysis (PCA)/factor analysis (FA), are effective approaches for analyzing large monitoring datasets, grouping multiple variables, and facilitating the identification of potential sources of pollution (Singh et al. 2005a; Ouyang et al. 2006; Zhou et al. 2007a; Su et al. 2011; Xiao et al. 2016; Gurjar and Tare 2019). Receptor-based models, such as absolute principal component score-multiple linear regression (APCS-MLR), are able to quantify the contribution of different components (Gholizadeh et al. 2016; Liu et al. 2020). However, to quantify the contribution of a specific pollution source to water quality indicators, it is necessary to establish a connection between the pollution source and the principal components of APCS-MLR. Achieving this requires both long-term water quality monitoring data for the lake and pollutant discharge data for the basin.

The present study was based on water quality monitoring data for Honghu Lake and pollutant discharge data for the Honghu Lake Basin collected over a long period. Monitoring data for nine water quality parameters as well as datasets for six potential sources of pollution to the lake were selected for 2004 to 2017. These data were analyzed using the water quality index, multiple statistical techniques, and source apportionment methods in conjunction with the seasonal characteristics of rainfall in the watershed. The objectives of the present study were to (1) characterize temporal trends in lake water quality; (2) identify the main driving forces affecting water quality indicators; (3) quantify the contribution of pollution sources in the Honghu Lake Basin to the water pollution of Honghu Lake; and (4) provide suggestions for further water quality management of Honghu Lake.

Materials and methods

Study area

Honghu Lake (113° 12′–113° 28′ E, 29° 41′–29° 58′ N), located in the southeast portion of the Jianghan Plain, is the seventh-largest freshwater lake in China (Li et al. 2018). The lake is classed as a shallow-water dammed lake (average depth 1.16 m) (Cai and Yi 1991) and is connected to the Four Lakes Main Canal, the Luoshan Main Canal, and the Yangtze River in the north, west, and south, respectively (Fig. 1). The main rivers flowing into Honghu Lake are the Four Lakes Main Canal and the Luoshan Main Canal, which receive pollutants from agriculture, livestock and poultry breeding, and industrial and domestic sewage (Wang et al. 2017). Honghu Lake belongs to Jingzhou City in Hubei Province. A survey in 2015 (Jingzhou Hydrology Bureau) determined that Honghu Lake has a surface area of 348 km² and a catchment area of 5980 km², with the basin hosting 3.3 million permanent residents. Honghu Lake is a national wetland reserve and fulfills important ecological service functions. The lake acts as an important breeding base and wintering habitat for many rare and endangered bird species as well as a gene bank for wetland species in central China. The lake also serves a critical function in regulating floods, hosts important fisheries, and acts as a water supply for industry and agriculture (Zhang et al. 2017). The study area has a subtropical humid monsoon climate, with an average annual temperature of 16–19 °C and yearly average precipitation between 900 and 1350 mm. Rainfall in the area is abundant but uneven. Summers are hot and rainy, whereas winters are cold and dry (Chen et al. 2020). The regional phreatic aquifer consists of sandy silt, fine sandy silt, mild clay, etc. with poor permeability (Zhang et al. 2017).

Data source

Monthly water quality and water depth data from eight monitoring stations spanning 2004–2017 were obtained from Honghu Environmental Monitoring Center. The water quality variables studied included pH, dissolved oxygen (DO), the potassium permanganate index (CODMn), biochemical oxygen demand (BOD), ammonia nitrogen (NH3-N), total nitrogen (TN), total phosphorus (TP), fluoride (F), and fecal coliforms (F. coli). The present study did not include heavy metal indicators since the concentrations of all heavy metal indicators were within the standard range and the majority of heavy metal indicators fell below the minimum detection concentration. The low-risk level of heavy metals in Honghu Lake has also been reported in other studies (Makokha et al. 2016; Li et al. 2017b). Analyses were performed following the instructions outlined in the Technical Specifications Requirements for Monitoring of Surface Water and Wastewater published by the Ministry of Ecology and Environment of the People’s Republic of China (standard number: HJ/T 91-2002). In the case of a determined water quality variable falling below the detection limit, that variable was set to the detection limit (Farnham et al. 2002).

Fig. 1 Location of Honghu Lake, Hubei Province, China, and the spatial distribution of the eight monitoring sites used in the present study.

Previous studies have suggested that pollution from precipitation, enclosure aquaculture, domestic wastewater, industrial wastewater, fertilizer use on farmland, and livestock production in the basin may be important contributors to the deterioration of water quality in Honghu Lake (Zhang et al. 2017; Li et al. 2019). This study managed to collect annual information on these six pollution sources in the Honghu Lake Basin from 2004 to 2017, including annual precipitation (AP), the area of enclosure aquaculture in Honghu Lake (EA), untreated domestic sewage discharge (UDS), untreated industrial wastewater discharge (UIW), fertilizer use (FU), and livestock and poultry production (LPP). AP and FU data were obtained from the Jingzhou Statistical Yearbook and the UIW and UDS data from the environmental statistics of the Jingzhou Ecology and Environment Bureau. Livestock and poultry breeding data were obtained from the Jingzhou Agricultural and Rural Development Center. The number of livestock and poultry was converted into the number of standard pigs according to the conversion relationship of the pollutant output (SEPA 2001). LPP represented the total number of standard pigs after conversion. EA was determined through visual interpretation of synthetic aperture radar (SAR) satellite imagery combined with aquaculture statistics for Honghu Lake. The SAR imagery data were mainly sourced from the satellite instruments of the Environmental Satellite (ENVISAT) advanced synthetic aperture radar (ASAR), European Remote Sensing (ERS) SAR, and Sentinel-1 SAR. The selected time period of the imagery was September or October.

Analysis methods

Division of water period (DWP)

Precipitation in the Honghu Lake Basin follows a cyclical pattern. However, no study on the division of water periods for Honghu Lake has been reported so far. Honghu Lake is the largest catchment area of the Honghu Lake Basin. Rainfall transports pollutants to Honghu Lake through storm runoff, ditches, and streams, thereby inevitably causing seasonal changes in the water depth and water pollution characteristics of the lake. The present study proposes a novel classification method to study changes in water quality within the year. Each year of data for Honghu Lake was first separated into a dry period (DP) and a wet period (WP) according to the changes in lake water depth. The specific classification method was as follows:

$$ R=\left[\begin{array}{cccc}{r}_{11}& {r}_{12}& \cdots & {r}_{1j}\\ {r}_{21}& {r}_{22}& \cdots & {r}_{2j}\\ \vdots & \vdots & \ddots & \vdots \\ {r}_{i1}& {r}_{i2}& \cdots & {r}_{ij}\end{array}\right] $$
(1)
$$ S=\left[\begin{array}{cccc}{s}_{11}& {s}_{12}& \cdots & {s}_{1j}\\ {s}_{21}& {s}_{22}& \cdots & {s}_{2j}\\ \vdots & \vdots & \ddots & \vdots \\ {s}_{i1}& {s}_{i2}& \cdots & {s}_{ij}\end{array}\right] $$
(2)
$$ {\overline{t}}_j=\frac{{s}_{1j}+{s}_{2j}+\cdots +{s}_{ij}-{s}_{\max j}-{s}_{\min j}}{i-2} $$
(3)
$$ \overline{T}=\left[\begin{array}{cccc}{\overline{t}}_1& {\overline{t}}_2& \cdots & {\overline{t}}_j\end{array}\right] $$
(4)

In Eqs. (1) to (4), R represents the data matrix of water depth; i is the number of years; j is the month of each year; S represents the score matrix; sij is the rank number of rij when sorting ri1, ri2, ⋯, rij from low to high; smaxj and sminj represent the maximum and minimum values among s1j, s2j, …, sij, respectively; and \( {\overline{t}}_j \) is the average score in month j. In the present study, the number of years (i) = 14 and the number of months (j) = 12. Finally, \( {\overline{t}}_1,{\overline{t}}_2,\cdots, {\overline{t}}_j \) were clustered into two groups by CA, and the groups of months with the highest and lowest scores were designated as the wet and dry periods, respectively. The rationality of the water period division was assessed using discriminant analysis. The principles and results of the discriminant analysis are provided in the Supplementary material.
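The scoring procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the depth matrix is synthetic, the function names are hypothetical, the trimmed mean divides by the number of years minus two, and the two-group split uses the largest gap between sorted scores as a simple stand-in for the hierarchical CA used in the study.

```python
import numpy as np

def monthly_depth_scores(depth):
    """Trimmed mean rank per month from a (years x 12) water depth matrix."""
    depth = np.asarray(depth, dtype=float)
    # Rank the 12 months within each year from low (1) to high (12).
    ranks = depth.argsort(axis=1).argsort(axis=1) + 1
    # For each month, drop the highest and lowest rank across years, then average.
    total = ranks.sum(axis=0) - ranks.max(axis=0) - ranks.min(axis=0)
    return total / (ranks.shape[0] - 2)

def split_two_groups(scores):
    """Assumed stand-in for CA: cut the sorted scores at their largest gap."""
    ordered = np.sort(scores)
    cut = np.diff(ordered).argmax()
    threshold = (ordered[cut] + ordered[cut + 1]) / 2
    return scores > threshold  # True marks the high-score (wet) group

# Synthetic example (not the Honghu Lake data): deeper water in June-October.
rng = np.random.default_rng(0)
months = np.arange(12)
seasonal = np.where((months >= 5) & (months <= 9), 2.0, 1.0)
depth = seasonal + rng.normal(0, 0.05, size=(14, 12))

scores = monthly_depth_scores(depth)
wet = split_two_groups(scores)
print("Wet-period months:", (np.flatnonzero(wet) + 1).tolist())
```

Ranking within each year before averaging makes the score insensitive to differences in absolute water depth between years, which is presumably why ranks rather than raw depths are compared.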

Temporal variations in lake water quality

To understand the water pollution of Honghu Lake during 2004–2017, it is necessary to analyze the intra-annual and inter-annual changes in its water quality. The water quality index method is a numerical method that converts a large number of water quality parameters into a single index describing the overall state of water quality (Han et al. 2020). However, the traditional water quality index method has certain limitations due to the subjective assignment of weights to each water quality parameter (Singaraju et al. 2018). The CCME-WQI is a water quality assessment method developed by the Canadian Council of Ministers of the Environment, which can provide a more comprehensive and objective reflection of the water quality status by integrating information on the percentage of indicators that exceed the standards, the percentage of monitoring sections in which standards are exceeded, and the extent of exceedance of standards (Dede et al. 2013; Yan et al. 2016). The variables used in the study were the water quality variables mentioned above. As a habitat for rare aquatic organisms, Honghu Lake was assessed according to Class II of China's Environmental Quality Standard for Surface Water (GB) (SEPA 2002). The corresponding limits were pH 6–9, DO ≥ 6 mg L−1, CODMn ≤ 4 mg L−1, BOD5 ≤ 3 mg L−1, NH3-N ≤ 0.5 mg L−1, TP ≤ 0.025 mg L−1, TN ≤ 0.5 mg L−1, F ≤ 1 mg L−1, and F. coli < 2000 colony forming units (cfu) L−1. The CCME-WQI scores were calculated as follows:

$$ CWQI=100-\frac{\sqrt{F_1^2+{F}_2^2+{F}_3^2}}{1.732} $$
(5)

In Eq. (5), F1 is factor 1 and is labeled the “scope.” It gives the percentage of indicators that exceed the objective value relative to the total number of indicators. F2 is factor 2 and is labeled the “frequency.” It gives the percentage of tests that exceed the objective value relative to the total number of tests conducted during the monitoring process. F3 is factor 3 and is labeled the “amplitude.” It gives the average extent to which the objective value is exceeded. The detailed computation procedure for F1, F2, and F3 can be found in Dede et al. (2013). The scores range from 0 to 100 and were ranked in five categories: (1) poor (0–44), (2) marginal (45–59), (3) fair (60–79), (4) good (80–94), and (5) excellent (95–100) (Gao et al. 2016).
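As an illustration of Eq. (5), the following sketch computes F1, F2, F3, and the index using the standard CCME definitions (excursions normalized by the number of tests, and F3 = nse / (0.01 nse + 0.01)). It is a simplified sketch, not the study's code: the function name and sample values are hypothetical, and two-sided objectives such as the pH range 6–9 are left out for brevity.

```python
import numpy as np

def ccme_wqi(values, limits, lower_is_bad):
    """CCME-WQI for a (samples x variables) array of test results.

    limits[k] is the objective for variable k; lower_is_bad[k] is True when
    values below the objective fail (e.g. DO) and False when values above fail.
    """
    values = np.asarray(values, dtype=float)
    limits = np.asarray(limits, dtype=float)
    lower = np.asarray(lower_is_bad, dtype=bool)

    failed = np.where(lower, values < limits, values > limits)
    f1 = 100.0 * failed.any(axis=0).sum() / values.shape[1]  # scope
    f2 = 100.0 * failed.sum() / values.size                  # frequency

    # Excursion: relative distance of each failed test from its objective.
    ratio = np.where(lower, limits / values, values / limits)
    excursion = np.where(failed, ratio - 1.0, 0.0)
    nse = excursion.sum() / values.size
    f3 = nse / (0.01 * nse + 0.01)                           # amplitude

    return 100.0 - np.sqrt(f1**2 + f2**2 + f3**2) / 1.732

# Hypothetical samples for DO, TP, and TN (mg/L) against the Class II limits.
limits = [6.0, 0.025, 0.5]
lower_is_bad = [True, False, False]
samples = [[7.0, 0.02, 0.4],
           [7.0, 0.05, 0.4]]  # the second TP test exceeds its limit
score = ccme_wqi(samples, limits, lower_is_bad)
print(round(score, 1))
```

Because F1 counts failing variables and F2 counts failing tests, a single persistent pollutant lowers the score through both terms, while F3 captures how far the failing tests are from the objective.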

Multivariate statistics

Water pollution cannot be assigned directly to specific pollution sources by APCS-MLR. Thus, correlation analysis and PCA were used to establish, step by step, the relationship between pollution sources in the Honghu Lake Basin and the APCS-MLR factors. Correlation analysis estimates the strength of relationships between variables; thus, it is an effective method for identifying the pollution sources that drive individual water quality indicators (Reimann et al. 2017). At a significance level of P < 0.05, Pearson correlation coefficients greater than 0.6 were taken to indicate significant correlations and coefficients in the range 0.4–0.6 to indicate moderate correlations (Li et al. 2013). In addition to the Pearson correlation coefficients between pollution sources and water quality indicators, the correlation coefficients among the six pollution sources were also calculated to avoid confounding when attributing a water quality indicator to a specific pollution source. The results of the correlation analysis provide the foundation for determining the pollution source category represented by each factor in PCA.
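The screening rule above can be illustrated with a short sketch. The series and function names are hypothetical, the significance test (P < 0.05) is omitted for brevity, and only positive coefficients are classified, since the study does not consider negative correlations.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    return float(np.corrcoef(x, y)[0, 1])

def correlation_strength(r):
    """Classify a coefficient using the 0.6 and 0.4 thresholds from the text."""
    if r > 0.6:
        return "significant"
    if r >= 0.4:
        return "moderate"
    return "not considered"

# Hypothetical annual series: a pollution source and a water quality indicator.
source = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
indicator = 2.0 * source + 1.0  # perfectly linear, so r is 1 up to rounding

r = pearson_r(source, indicator)
print(correlation_strength(r))
```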

PCA is a statistical algorithm that transforms multiple variables into a few composite variables that retain the majority of the original information, thus reducing the dimensionality of the data (Wang et al. 2016; Liu et al. 2020). FA further simplifies the structure of the PCA by rotating the defined axes and constructing varifactors (VFs) to describe the unobservable potential factors that contribute to the compositional pattern of water quality data (Han et al. 2020). PCA was used to analyze the DP and WP water quality data of the best and worst years for water quality identified by CCME-WQI. Since the pollution driver of each water quality indicator had been determined by correlation analysis, the pollution category represented by each VF could be judged from the drivers of the indicators with strong loadings in that VF. The Kaiser-Meyer-Olkin (KMO) and Bartlett’s sphericity tests were conducted to evaluate the validity of PCA, which required the KMO values to be > 0.5 and Bartlett’s test to indicate significance (P < 0.001) (Gao et al. 2016). Only components with eigenvalues greater than 1 were retained until the cumulative variance was higher than 75%. All PCAs conducted in the current study retained the same number of components to facilitate the comparison of results (Gao et al. 2016). The pollution categories identified by PCA were the pollution categories used for source apportionment in the APCS-MLR. Therefore, the PCA results form the link between the pollution sources in the Honghu Lake Basin and the pollution categories in the APCS-MLR.

Source apportionment using the APCS-MLR model

The APCS-MLR model is an important receptor model and multi-source analysis method that is constructed by combining the multiple linear regression (MLR) model with the de-normalized absolute principal component scores (APCS) produced by PCA (Singh et al. 2005a, 2005b; Su et al. 2011; Gholizadeh et al. 2016; Chen et al. 2019). A detailed description of APCS can be found in Thurston and Spengler (1985) and Zhou et al. (2007c). The source contributions to the concentrations of water quality indicators can be calculated by MLR as:

$$ {C}_i={\left({r}_0\right)}_i+{\sum}_{j=1}^p{r}_{ij}\times {APCS}_j $$
(6)

where Ci is the concentration of the ith indicator, (r0)i is the constant term obtained by the MLR for indicator i, rij is the multiple regression coefficient of pollution category j for indicator i, APCSj is the scaled value of the rotated factor j for the considered sample, and rij × APCSj represents the contribution of pollution category j to Ci. The current study used APCS-MLR to further analyze the contribution of each pollution category identified by PCA to the concentrations of the various water quality indicators. Since the pollution sources represented by the strong loading indicators in each pollution category had been determined through correlation analysis, the contributions of these pollution sources could also be calculated through APCS-MLR. Moreover, the effect of controlling the drivers of water quality indicators can be quantified by comparing the changes in the contribution of the same pollution category to the same water quality indicator across different periods.
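A compact sketch of Eq. (6) is given below, following the general APCS procedure of Thurston and Spengler (1985): component scores are shifted by the score of an artificial zero-concentration sample, and each indicator is then regressed on the resulting APCS. The sketch rests on several assumptions: the data are synthetic, the function name is hypothetical, and unrotated principal components are used instead of the varimax-rotated varifactors employed in the study.

```python
import numpy as np

def apcs_mlr(conc, n_factors):
    """Return (intercepts, mean contributions) for a (samples x indicators) array."""
    conc = np.asarray(conc, dtype=float)
    mean, std = conc.mean(axis=0), conc.std(axis=0, ddof=1)
    z = (conc - mean) / std

    # PCA on the correlation matrix; keep the leading components (no rotation).
    eigval, eigvec = np.linalg.eigh(np.corrcoef(z, rowvar=False))
    top = np.argsort(eigval)[::-1][:n_factors]
    weights = eigvec[:, top] / np.sqrt(eigval[top])
    scores = z @ weights

    # Score of an artificial sample with zero concentration for every indicator.
    z0 = (0.0 - mean) / std
    apcs = scores - z0 @ weights

    # Regress each indicator on the APCS with an intercept, as in Eq. (6).
    design = np.column_stack([np.ones(len(conc)), apcs])
    coef, *_ = np.linalg.lstsq(design, conc, rcond=None)
    intercept, slopes = coef[0], coef[1:]

    # Mean contribution of factor j to indicator i: r_ij times mean APCS_j.
    contrib = slopes * apcs.mean(axis=0)[:, None]
    return intercept, contrib

# Synthetic data: two latent sources mixed into four indicators plus noise.
rng = np.random.default_rng(1)
sources = rng.uniform(0.5, 2.0, size=(60, 2))
loadings = np.array([[1.0, 0.2, 0.8, 0.1],
                     [0.1, 0.9, 0.3, 1.2]])
conc = 0.5 + sources @ loadings + rng.normal(0, 0.05, size=(60, 4))

intercept, contrib = apcs_mlr(conc, n_factors=2)
share = 100.0 * contrib / conc.mean(axis=0)  # percent of each indicator's mean
print(np.round(share, 1))
```

Because the regression includes an intercept, the intercept plus the summed mean contributions reproduces each indicator's observed mean, so the intercept can be read as the share not explained by the retained factors.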

Results and discussion

Changes in six pollutant sources in the basin between 2004 and 2017

The data for each pollution source were processed using the maximum normalization (x/xmax) method to allow a comparison of trends among the six pollution sources. As shown in Fig. 2, EA, UDS, and UIW changed markedly from 2004 to 2017. There has been an evident decline in UDS since 2006, and by 2017, the UDS was only 7.86% of that in 2006, which may be related to the improvement of the sewerage network and the increase in sewage treatment rates in the basin. UIW increased sharply between 2010 and 2012 and then decreased year by year, which could be attributed to the acceleration of industrialization and changes in environmental protection policies (Fu et al. 2010; Li et al. 2017a, 2017b). From 2004 to 2017, EA in Honghu Lake first decreased, then increased, and finally decreased to 0. Dai et al. (2019) attributed the fluctuations in the aquaculture area in Honghu Lake to environmental policy drivers. Zhang et al. (2017) reported that two aquaculture enclosure removal projects had been implemented in Honghu Lake, at the end of 2004 and 2016. Since 2017, no enclosure aquaculture in any form has been allowed in Honghu Lake owing to the further tightening of lake protection. FU and LPP fluctuated slightly between 2006 and 2017, with both showing an initial gradual increase followed by a gradual decrease, peaking in 2011 and 2013, respectively. The decrease in FU was related to adjustments in local government recommendations on the use of crop fertilizers in 2010 with the aim of reducing agricultural non-point source pollution (Chen et al. 2020). A possible explanation for the decline in LPP is that the promulgation of the new version of the “Regulation on the Prevention and Control of Pollution from Large-scale Breeding of Livestock and Poultry (2013)” resulted in the operations of many non-compliant small livestock and poultry farms being halted by local government. The Honghu Lake Basin has a subtropical humid monsoon climate. Although AP in the Honghu Lake Basin fluctuated, there was an overall upward trend in precipitation after 2012.
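The maximum normalization used for Fig. 2 amounts to dividing each series by its own maximum. A minimal sketch with hypothetical values (not the study's data) is:

```python
import numpy as np

def max_normalize(series):
    """Scale each column (one pollution source) by its maximum, x / x_max."""
    series = np.asarray(series, dtype=float)
    return series / series.max(axis=0)

# Hypothetical annual values for two sources in different units.
raw = np.array([[10.0, 4.0],
                [5.0, 8.0]])
normalized = max_normalize(raw)
print(normalized)
```

After scaling, every series peaks at 1, so sources measured in different units can be plotted on a common axis and compared by trend rather than magnitude.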

Fig. 2 Changes in six possible pollution sources in the Honghu Basin from 2004 to 2017 (each pollution source was normalized by its maximum value). AP, annual precipitation; EA, area of enclosure aquaculture; UDS, untreated domestic sewage discharge; UIW, untreated industrial wastewater discharge; FU, fertilizer use; LPP, livestock and poultry production.

Division of water period

The average water depth scores shown in Fig. 3a illustrate that the water depth of Honghu Lake was significantly higher during the second half of the year than during the first half. CA was performed on the average monthly water depth scores to divide each year into a DP and a WP. The results shown in Fig. 3b demonstrate that the monthly scores can be divided into two clusters at (Dlink/Dmax) × 100 < 5 and that the difference between the clusters was significant (Zhou et al. 2007a). Cluster 1 (WP) contained the 5 months from June to October, covering summer and early autumn. During this period, the water level of the Yangtze River was higher than that of Honghu Lake, and consequently the floodgates of Honghu Lake were opened to store water, allowing flow from the Yangtze River into Honghu Lake and causing the water level of Honghu Lake to rise (Li et al. 2015). In addition, precipitation during this period accounted for ~78.7% of annual precipitation (Jingzhou Statistical Yearbook), and the majority of the resulting runoff flowed into Honghu Lake, further raising the water level. Moreover, high water temperatures and abundant water resources provided good conditions for fish growth; thus, enclosure aquaculture was also very active during this period (Zhang et al. 2017). Cluster 2 (DP) included the remaining 7 months. During this period, the water level of the Yangtze River fell below that of Honghu Lake. In preparation for flood control in the following year, the sluice gates of Honghu Lake were opened to release water. In addition, there was less rainfall in the Honghu Basin during the DP, which contributed to the decline in the water depth of Honghu Lake. The accuracy of the water period division is reported in Table S1 of the Supplementary material. Ouyang et al. (2006) assessed the surface water quality of the St. Johns River according to seasonal variations; however, this method is not applicable to regions in which seasons cannot be clearly separated. Li et al. (2017a, 2017b) divided the water period of Honghu Lake into dry (January to April), wet (May to August), and normal seasons (September to December) based on long-term experience. However, the results of the present study demonstrated that such a classification lacks accuracy. In general, the hydrological conditions and pollution sources of many rivers and lakes change periodically, and a suitable method of categorizing water periods could assist in more accurately identifying important parameters that affect water quality (Baldwin and Mitchell 2000; Baldwin et al. 2008).

Fig. 3 (a) Histogram of average monthly water depth scores for the period 2004–2017; (b) cluster analysis dendrogram of the monthly average water depth scores.

Changes in water quality trends

The changes in the annual average concentrations of the nine water quality indicators in Honghu Lake from 2004 to 2017 are shown in Fig. 4. CODMn, BOD5, TP, and TN were the water quality indicators whose concentrations exceeded the desired standard for most of the study period. Therefore, the pollution sources affecting the concentrations of CODMn, BOD5, TP, and TN are likely to be the most important pollution sources affecting the water quality of Honghu Lake. Temporal trends in water quality over the WP, DP, and whole year (WY) in Honghu Lake were identified by CCME-WQI (Fig. 5). The results showed that the water quality of Honghu Lake could be divided into a rising phase (2004–2011) and a declining phase (2012–2017) during the study period. It is worth noting that the water quality over the WP was generally better than that over the DP between 2004 and 2011, whereas the water quality over the DP was generally better than that over the WP between 2012 and 2017. Combined with the results in Fig. 2, it was speculated that the rising water quality phase might be attributed to the significant decrease in EA and UDS between 2004 and 2011, while the declining water quality phase may be attributed to the significant increase in EA and AP between 2012 and 2017. Whether the WP or the DP had better water quality was also considered to result from changes in UDS, AP, and EA, because the emissions from these pollution sources were seasonal in the Honghu Lake Basin (Zhang et al. 2017). The best year for water quality was 2012, which may result from the lower AP, EA, and UDS in that year. The worst years for water quality in the rising and declining phases were 2005 and 2017, respectively, but none of the six pollution sources in the Honghu Lake Basin reached an extreme value in these 2 years. Thus, the unsatisfactory water quality in these 2 years was believed to be the result of the disturbance of the lake sediments caused by the dismantling of the aquaculture enclosures (Pettersson 2001).

Fig. 4 Time series of water quality parameters in Honghu Lake from 2004 to 2017, shown against the Class II limits of the Environmental Quality Standard for Surface Water.

Fig. 5 Temporal variations of the Canadian Council of Ministers of the Environment Water Quality Index (CCME-WQI) in Honghu Lake from 2004 to 2017 for different periods related to hydrological condition (WP, wet period; DP, dry period; WY, whole year).

Identification of the drivers of water quality indicators

Although negative correlations between variables have been considered in some studies (Han et al. 2020; Liu et al. 2020), they were not considered in the current study since the pollution sources (besides AP) selected in the present study were direct contributors to the deterioration of water quality indicators (Fig. 6). Pollution sources that had a significant correlation with a water quality indicator were considered the dominant drivers of that indicator. In this study, all correlation coefficients greater than 0.4 were significant at P < 0.05. Correlation analysis between pollution sources showed that, apart from the correlation between UIW and LPP (r = 0.82), there were no obvious correlations between the sources. The correlation analysis between water quality indicators and pollution sources showed significant correlations between EA and pH (0.79), NH3-N (0.83), TP (0.67), and TN (0.63), indicating that enclosure aquaculture may be the main driver of changes to pH and nutrient pollution in Honghu Lake. This result is consistent with Bian et al. (2012) and Querijero and Mercurio (2016), who showed that aquaculture was the main source of nutrients in lakes and an important driver of water eutrophication. Yang et al. (2017) attributed an increase in pH in Dianchi Lake, China, to higher temperatures, resulting in stronger photosynthesis and increased carbon dioxide consumption by algae. However, the field investigation in the current study found that aquaculture farmers regularly applied quicklime to the water to neutralize acidity generated by the decomposition of feed residues and fish feces, as well as to sterilize and improve the water quality. Therefore, aquaculture in some lakes may be an additional important driver of changes in pH. Significant correlations between UDS and BOD (0.88) and F. coli (0.91) were identified in Honghu Lake. This result is supported by Han et al. (2020), who attributed the BOD and F. coli pollution in Baiyangdian Lake to the discharge of domestic sewage. Environmental statistics showed that the average influent concentrations of BOD and F. coli to the Jingzhou Sewage Treatment Plant were 63 mg L−1 and 240,000 cfu L−1, respectively. Clearly, the discharge of untreated sewage would have a considerable negative impact on the environment.

Fig. 6 Pearson correlation coefficients between lake water quality and pollution sources. AP, annual precipitation; EA, area of enclosure aquaculture; UDS, untreated domestic sewage discharge; UIW, untreated industrial wastewater discharge; FU, fertilizer use; LPP, livestock and poultry production.

UIW showed significant correlations with CODMn (0.74) and F (0.86). Qin et al. (2007) and Gao et al. (2016) similarly attributed declines in CODMn concentration to reduced industrial wastewater discharge and increased urban sewage treatment rates. Fluorine in industrial wastewater usually originates from fluoride mines, cement plants, fluorine chemical factories, phosphorus fertilizer plants, and smelters (Huang et al. 2010; Su et al. 2011). Information provided by the Industry and Commerce Bureau showed that although there are no fluorine mines in the Honghu Lake Basin, there are a few cement, fertilizer, and chemical factories. The direct discharge of wastewater from some of these enterprises may be the reason for UIW being related to F concentration in Honghu Lake (Duan et al. 2016; Fuge 2019). The Pearson correlation coefficients between LPP and F and F. coli were 0.83 and 0.77, respectively. Livestock and poultry breeding are dominant industries in the Honghu Lake Basin. Fluorine in LPP mainly originates from fluoroquinolone antibiotics (Wang et al. 2017). Although antibiotics have a limited effect on F concentration, Honghu Lake has a low F background value of 0.25 mg L−1, and therefore, the impact of LPP on F cannot be ignored. DO showed no significant correlations with the six pollution sources, which may be because DO is governed by seasonal or biological factors, such as temperature, wind speed, and aquatic plants (Singh et al. 2005b; Zhou et al. 2007b). AP showed significant correlations with TP (0.75) and NH3-N (0.71) and a moderate correlation with TN (0.52). Rainfall can act as a driver of non-point source pollution as rainfall-runoff washes pollutants from the land surface into the lake (Hülya and Hayal 2008). Non-point source pollution is aggravated by agriculture, which is a dominant land use in the Honghu Lake Basin. Excessive application of chemical fertilizers results in the loss of fertilizers during rainfall, thereby explaining the observed correlations of rainfall with TN and TP (Chen et al. 2013; Gao et al. 2016). Although agricultural non-point source pollution results from the combined effect of excessive fertilization and rainfall, the results of the correlation analysis implied that rainfall played a decisive role under conditions of little change in fertilizer use. As shown in Fig. 4, the indicators in Honghu Lake that exceeded the standards included CODMn, BOD5, TP, and TN. Combined with the results of the correlation analysis, this suggests that the pollution generated by EA, AP, UDS, and UIW in recent years may be the main reason for the failure of water quality in Honghu Lake to meet the desired standard.

Identification of potential pollution categories by PCA

Comparing the pollution status of the best and worst water quality years reflects the role of pollution source control more intuitively. Based on the results of CCME-WQI, and to minimize the interference of other pollution sources in the APCS-MLR analysis, the water quality data for the dry and wet periods in 2012, 2004, and 2016 were used to explore the pollution situation during the best and worst periods of water quality in Honghu Lake. The KMO values for the three dry periods were 0.532, 0.551, and 0.643, respectively, and the corresponding Bartlett’s sphericity statistics were 84, 96, and 130 (P < 0.001). The KMO values for the three wet periods were 0.513, 0.74, and 0.747, respectively, and the corresponding Bartlett’s sphericity statistics were 81, 191, and 159 (P < 0.001). These results indicated that PCA was appropriate and could meaningfully reduce the dimensionality of the data for each dry and wet period (Su et al. 2011; Dugga et al. 2020). Following the thresholds recommended by Shrestha and Kazama (2007), absolute values of indicator loadings > 0.75, 0.5–0.75, and 0.3–0.5 were considered to be strong, moderate, and weak, respectively.
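The loading thresholds above can be written as a small classification rule. The sketch below is illustrative only: the function name `loading_strength` is hypothetical, the label "negligible" for absolute loadings below 0.3 is an assumption, and so is the assignment of the boundary values (0.75 and 0.5 to "moderate", 0.3 to "weak"), which the text does not specify. The example loadings are invented and are not taken from Table 1 or Table 2.

```python
def loading_strength(loading):
    """Classify a rotated factor loading by its absolute value."""
    a = abs(loading)
    if a > 0.75:
        return "strong"
    if a >= 0.5:        # assumption: 0.5 and 0.75 fall in the moderate band
        return "moderate"
    if a >= 0.3:        # assumption: 0.3 falls in the weak band
        return "weak"
    return "negligible"  # assumption: label for loadings below 0.3


# Invented loadings for one varifactor (illustration only).
example_vf = {"CODMn": 0.88, "BOD5": 0.61, "TN": -0.35, "pH": 0.12}
labels = {name: loading_strength(value) for name, value in example_vf.items()}
```

Because the rule uses the absolute value, a strongly negative loading is treated the same way as a strongly positive one.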

For water quality data analyzed over the DP (Table 1), the first varifactor (VF1) in 2004 had strong and positive loadings on CODMn, F, and DC and a moderate positive loading on BOD5. Correlation analysis indicated that CODMn, F, DC, and BOD5 mainly originated from LPP, UIW, and UDS; thus, VF1 was classified as “mixed point source pollution” (Hülya and Hayal 2008). VF2 for 2012 and VF1 for 2016 were classified in the same way because their strong loadings were similar to those of VF1 for 2004. VF2 in 2004 had strong positive loadings on DO and TP and a moderate loading on TN. Honghu Lake is a shallow lake, and since the dry period is associated with cold and windy conditions, the DO of the lake water is easily replenished. Therefore, VF2 for 2004 was recognized as a “natural pollution source” (Han et al. 2020). In addition, VF1 in 2012 and VF2 in 2016 had strong DO loadings and were also identified as “natural pollution sources.” VF3 in 2004 showed a strong loading on NH3-N and moderate loadings on BOD5 and TN, which can be attributed to the effect of EA and AP based on the results of correlation analysis. Therefore, VF3 in 2004 was identified as a “mixed non-point source” (Chen et al. 2013; Liu et al. 2020). Similarly, VF4 for 2012 and VF3 for 2016 were identified as “mixed non-point sources.” Finally, VF4 in 2004, VF3 in 2012, and VF4 in 2016 all showed a strong loading on pH. Changes in the pH of Honghu Lake were related to the application of quicklime by aquaculture farmers, and thus, these VFs were identified as a “chemical pollution source” (Ma et al. 2019). Overall, each VF of the PCA for the DP in the 3 years was identified as either a “mixed point source,” “natural pollution source,” “mixed non-point source,” or “chemical pollution source.”

Table 1 Loadings of nine selected variables on VARIMAX rotated factors of dry periods (DP) for different years in Honghu Lake

The analysis results over the WP (Table 2) showed that VF1 in 2004 had strong loadings on CODMn, TP, and TN and moderate loadings on NH3-N and F. According to correlation analysis, CODMn mainly originated from UIW, TP and NH3-N mainly originated from rainfall-runoff and EA, TN mainly originated from EA, and F mainly originated from UIW and LPP. Therefore, VF1 was identified as a “mixed source” (Su et al. 2011). Similarly, VF1 of 2016 was identified as a “mixed source.” VF2 in 2004 showed strong positive loadings on BOD5 and F. coli, and a moderate loading on F, and based on the results of correlation analysis, was associated with “domestic pollution sources” (Najar and Khan 2012). Similarly, VF2 in 2016 explained 22.72% of total variance in BOD5 and similarly represented a “domestic pollution source.” VF3 in 2004 and 2012 and VF4 in 2016 showed a high loading on pH. Therefore, similar to VF4 in 2004 during the DP, they were identified as “chemical pollution sources.” VF4 in 2004 and 2012 showed strong loading on DO; the higher DO in Honghu Lake over the WP was considered the influence of aquatic plants (Fitch and Kemker 2014; Liu et al. 2019). In addition, VF1 for 2012 over the WP contained the same variables as VF1 for 2004 over the DP, which represented the mixed point source pollution, while VF2 for 2012 over the WP contained the same variables as VF3 for 2004 over the DP, which represented mixed non-point source pollution. Finally, VF3 in 2016 had a strong loading on TP and moderate loadings on F and DC. This principal component was associated with rainfall non-point source pollution (Zhang et al. 2020). The VFs of the wet periods were ultimately identified as one of the following seven types: (1) mixed source, (2) domestic pollution sources, (3) chemical pollution source, (4) aquatic plant sources, (5) mixed point source, (6) mixed non-point source, and (7) rainfall non-point sources.

Table 2 Loadings of nine selected variables on VARIMAX rotated factors of wet periods for different years in Honghu Lake

Compared with DP, the compositions of VFs in the WP were more complicated. Wet periods are characterized by high rainfall (Jingzhou Statistical Yearbook), and pollutants during this period will be mixed through surface runoff or river flow (Ma et al. 2018), thereby posing difficulties to identification by PCA. In addition, although the DP contained the same potential pollution categories every year, their rank changed, whereas the WP changed not only the rank but also the type of pollution. These results can be explained by the changing environment and pollutants in the Honghu Lake Basin (Li and Zhang 2010). For example, mixed point sources and mixed sources explained the majority of total variance during the DP and WP, respectively, and were the most important pollution categories in the dry and wet periods. UIW was highest in 2012; thus, F loading was the highest in the PCA for 2012. EA was lowest in 2012, and the corresponding loadings of TN and TP in 2012 were lower than those of the other 2 years. Duan et al. (2016) found that nutrients, organics, chemicals, heavy metals, and natural pollutants were the main pollution categories in Eastern Poyang Lake. Gao et al. (2016) determined that the main pollution categories of the Three Gorges Reservoir were mixed pollution, heavy metals, nutrient inputs, and organic pollutants. These previous studies and the current study indicate that although there are similarities in the pollution categories among Chinese lakes, each lake also has unique physical and chemical characteristics due to different natural and anthropogenic features (Huang et al. 2010). Li et al. (2017a) determined the potential pollution categories of Honghu Lake to be nutrients, climate, and geographical factors. The results of Li et al. (2017a) are similar to those of other studies that identified VFs through inference. 
Although those studies identified the main categories of pollution of the lake, they provided no clear guidance for tracking specific pollution sources and subsequent targeted reduction (Wang et al. 2011). In contrast, identifying VFs based on the correlations between water quality indicators and pollution sources should be more accurate and better justified.

Source contribution based on APCS-MLR

Coefficients of determination (R2) shown in Tables 3 and 4 indicated that APCS-MLR was relatively accurate, except for TN over the DP in 2004 (0.39) and TP over the WP in 2012 (0.46) (Zhou et al. 2007a). Chemical sources accounted for more than 50% of the influence on pH over all periods, indicating that the application of lime played an important role in the pH of Honghu Lake. The contribution of strong sources to DO during the WP was higher than that during the DP, indicating that aquatic plants may strongly influence DO. UIW was the main driver of CODMn in Honghu Lake. Compared with 2012, UIW decreased by 65% in 2016, but this only reduced the contribution of mixed sources or mixed point sources to CODMn by 1.27 mg L−1 and 1.07 mg L−1 in the wet and dry periods, respectively, and the influence of UIW on CODMn remained at a high level (46.6% in the DP and 55.2% in the WP). This result demonstrated that although the management of industrial wastewater discharge by the local government played a certain role in reducing the concentration of CODMn in Honghu Lake (Fu et al. 2010), existing potential pollution sources continue to have a strong influence on CODMn and need to be further controlled.
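In the APCS-MLR approach, the factor scores from PCA are usually converted to absolute principal component scores (APCS) by subtracting the score of an artificial sample in which every concentration is zero. The sketch below illustrates only this step, under stated assumptions: the score coefficients in `weights` are taken as given from a rotated PCA fitted elsewhere, the population standard deviation is used for standardization, and all names and numbers are hypothetical rather than the study's own.

```python
from statistics import mean, pstdev


def absolute_pc_scores(samples, weights):
    """Return the APCS of every sample for every retained component.

    samples: list of rows, one concentration per water quality variable.
    weights: one list of score coefficients per retained component
             (assumed to come from a PCA fitted elsewhere).
    Each variable must vary across samples, otherwise its SD is zero.
    """
    columns = list(zip(*samples))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) for col in columns]  # assumption: population SD

    def standardize(row):
        return [(c - m) / s for c, m, s in zip(row, mus, sigmas)]

    def scores(z_row):
        return [sum(w * z for w, z in zip(comp, z_row)) for comp in weights]

    # Score of the artificial sample with zero concentration everywhere.
    zero_scores = scores(standardize([0.0] * len(columns)))
    return [
        [s - s0 for s, s0 in zip(scores(standardize(row)), zero_scores)]
        for row in samples
    ]


# Invented data: three samples of two variables, one retained component.
samples = [[4.0, 0.10], [5.0, 0.14], [6.0, 0.12]]
weights = [[0.6, 0.4]]
apcs = absolute_pc_scores(samples, weights)
```

Each indicator would then be regressed on the APCS columns by multiple linear regression, and the R2 of that regression is the kind of fit statistic discussed above.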

Table 3 Concentration contribution of pollution sources to each water quality variable during dry periods in Honghu Lake, China. (mg/L)
Table 4 Concentration contribution of pollution sources to each water quality variable during wet periods in Honghu Lake, China. (mg/L)

UDS was the main driver of BOD concentration in Honghu Lake. Compared with 2004, UDS in 2016 was reduced by about 85%, and the contribution of strong sources to BOD decreased by 0.73 mg L−1 and 0.66 mg L−1 over the DP and WP, respectively. Furthermore, domestic sources seemed to have a higher impact on BOD during the WP than during the DP, and their contribution remained at a relatively high level (> 50%). This suggests that the identified domestic sources may include other sources besides domestic pollution, such as the agitation of bottom mud, that affect BOD during the WP (Lin et al. 2018). Correlation analysis identified NH3-N, TP, and TN as the most important indicators relating to aquaculture and rainfall. Compared with 2004, aquaculture in 2012 was reduced by 81%, whereas annual rainfall was similar. Correspondingly, the average contributions of strong sources to NH3-N, TP, and TN were reduced by 0.2 mg L−1, 0.039 mg L−1, and 0.37 mg L−1, respectively. In 2016, aquaculture was 51% lower than in 2004, but rainfall was 34% higher, and the average contributions of strong sources to NH3-N, TP, and TN were reduced by 0.04 mg L−1, 0.023 mg L−1, and 0.25 mg L−1, respectively. The indicators that exceeded the Class II water quality standards in 2016 were CODMn, TP, and TN, which exceeded the standards by 0.09 mg L−1, 0.020 mg L−1, and 0.30 mg L−1, respectively. Comparing the changes in pollution emissions and source contributions in 2004, 2012, and 2016, it can be inferred that if UIW and EA are further reduced, the concentrations of CODMn and TP in Honghu Lake will meet the water quality standards. However, even after enclosure aquaculture is completely dismantled, rainfall non-point source pollution will need to be controlled further for the TN concentration to meet the Class II water quality standard. With the 85% reduction in UDS between 2004 and 2016, the contribution of domestic sources to F. coli decreased by 772 cfu L−1 and 1001 cfu L−1 over the DP and WP, respectively, indicating that the government’s treatment of domestic pollution had a significant effect on reducing the concentration of F. coli in Honghu Lake.

In general, the impact of point source pollution on Honghu Lake gradually weakened, and non-point source pollution resulting from rainfall became the most important factor affecting the water quality of Honghu Lake. The percentage contributions of the various pollution sources to the water quality of Honghu Lake are shown in Table S2 of the Supporting Materials. As shown in Table S2, unidentified sources (UIS) contributed to pollution in Honghu Lake for most water quality variables over all periods, with contributions ranging from 0.8 to 30.6%. Although correlation analysis identified the pollution sources that resulted in fluctuations in water quality, long-term fixed pollution sources such as endogenous pollution could not be identified. The results of APCS-MLR indicated the existence of these pollutants and pointed to their important impact on the water quality of Honghu Lake. It is recommended that future studies conduct more systematic research on UIS and endogenous pollution.
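Percentage contributions of the kind summarized in Table S2 can be derived from a fitted regression for one indicator. In the usual APCS-MLR formulation, each source contributes its regression coefficient multiplied by its mean APCS, and the regression constant is read as the share of unidentified sources. The sketch below assumes that formulation and assumes all terms are non-negative (negative terms would need separate treatment). The function name `apportion` and the numbers are hypothetical.

```python
def apportion(intercept, coefficients, mean_apcs):
    """Split a mean concentration into per-source and unidentified parts.

    intercept:    regression constant, read here as unidentified sources
    coefficients: regression slope for each identified source
    mean_apcs:    mean absolute principal component score for each source
    Returns (concentration parts, percentage parts); the last entry of
    each list is the unidentified-source share.
    """
    if len(coefficients) != len(mean_apcs):
        raise ValueError("one coefficient is needed per source")
    parts = [b * s for b, s in zip(coefficients, mean_apcs)] + [intercept]
    total = sum(parts)
    if total <= 0:
        raise ValueError("total concentration must be positive")
    percents = [100.0 * p / total for p in parts]
    return parts, percents


# Invented regression results for one indicator (illustration only).
parts, percents = apportion(
    intercept=0.5, coefficients=[1.2, 0.4], mean_apcs=[2.0, 1.5]
)
```

Because an ordinary least squares fit with a constant term passes through the means, the parts add up to the mean concentration of the indicator, and the percentages add up to 100.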

Conclusions

The present study applied a variety of analysis methods to the combined water quality data of Honghu Lake and pollution discharge data of the Honghu Lake Basin to systematically analyze the water pollution of Honghu Lake. The water period division method based on CA categorized the months of each year into two water periods according to water depth, with June–October representing the WP and November–May representing the DP. The CCME-WQI showed that the water quality of Honghu Lake had an overall improving trend from 2004 to 2011 and an overall declining trend from 2012 to 2017.

The main drivers of different water quality indicators were identified by correlation analysis. Generally, pH was affected by aquaculture, whereas CODMn and BOD were mainly affected by industrial pollution and domestic pollution, respectively, and NH3-N, TP, and TN were mainly affected by enclosure aquaculture and rainfall. PCA identified a total of four and seven types of pollution categories in the DP and WP, respectively, in 2004, 2012, and 2016, revealing the complexity and variability of pollution categories in Honghu Lake. Receptor-based source apportionment through APCS-MLR showed the differences in source apportionment over these 3 years. By comparing the source contributions in different periods, the effect of pollutant reduction was quantified. Based on the results of source apportionment, it was predicted that the concentrations of most water quality indicators in Honghu Lake would meet the water quality requirements after 2017, but rainfall non-point source pollution must be controlled in the future for the TN concentration to reach the desired level. The results of the present study affirmed the effect of managing point source pollution and aquaculture enclosures on improving water quality and provide a reference for further water quality management of Honghu Lake.