Introduction

Understanding the mechanisms of water environment variability and their relation to water security is an important global issue for efficient river basin management. There is a general consensus that water quality deterioration has become more pronounced as urbanization has accelerated around the world. This is especially evident in China, where water quality problems are becoming a key limiting factor for sustainable socio-economic development. In the Huai River Basin (HRB), one of the seven large river basins in China, water shortages and water quality problems caused by rapid economic development, increased water consumption, and aggravated water pollution since the 1980s have attracted much attention (Liu and Xia 2004; Zhao et al. 2010; Zhang et al. 2013; Zhai et al. 2014; Dou et al. 2015). In particular, frequent water pollution incidents greatly affected industrial and agricultural production and urban water supply in the HRB during the 1990s. Since the 2000s, the Chinese government has made many efforts to improve water quality in the HRB, such as adjusting the economic structure, controlling industrial pollution, building sewage treatment plants, and treating agricultural non-point source pollution. Nevertheless, the China Environment Bulletin (2014) showed that severe water pollution problems existed at approximately half of the monitoring stations. Therefore, an overall analysis of the spatio-temporal variability of water quality during the past several years is of great significance in providing a basis for sustainable water environment management.

Methods for analyzing water quality trends can generally be divided into two groups: (1) modeling future trends and (2) detecting past trends based on observed time-series data (Huang et al. 2012). Moreover, many statistical techniques have been used to clarify the potential factors influencing water quality variability. It is noteworthy that water quality data have several particular characteristics (Hirsch and Slack 1984): the data series are often non-normally distributed, they show seasonality, and water quality is correlated with the magnitude of the flow. Therefore, although both parametric and non-parametric tests can be applied to detect water quality trends, the latter, which make few assumptions about data structure, have become increasingly popular because they accommodate these inherent characteristics. In consideration of the seasonality of water quality parameters (Chang 2008), the seasonal Mann–Kendall test, a non-parametric test modified by Hirsch et al. (1982), has been widely used to detect trends in long-term water quality data (Zipper et al. 2002; Djodjic and Bergström 2005; Boeder and Chang 2008). For example, Hirsch et al. (1991) observed increasing trends in total phosphorus, dissolved solids, and sulfate concentrations during 1972–1989 in the Apalachicola River, Florida. Robson and Neal (1996) reported a significant increase in dissolved organic carbon in Plynlimon, mid-Wales. Amini et al. (2016) found a monotonic trend in the groundwater resources of one city and 11 villages in Larestan and Gerash during their study period, and Bouza-Deaño et al. (2008), Chang (2008), and Zhai et al. (2014) also investigated trends in various water quality parameters using the seasonal Mann–Kendall test.

Spatial autocorrelation is a common phenomenon in spatial data analysis, and it occurs when variables at nearby sites are similar to each other (Tobler 1970). Identifying spatial autocorrelation is an important preliminary step in spatial analysis (Sokal and Oden 1978a, b; Dormann et al. 2007; Dale and Fortin 2014) and can provide useful information for recognizing the variation process and identifying the existing structures of water quality variables (Brody et al. 2005; Dormann et al. 2007). One efficient method for determining the spatial patterns of water quality is Moran’s I, which has frequently been used to analyze the distribution and structure of water quality variables and to measure the spatial autocorrelation of non-point sources (Chang 2008; Zhai et al. 2014). Understanding the relationship between river basin characteristics and water quality variations is also important. GIS-based regression analysis has proved to be an effective statistical method for diagnosing the contributions of anthropogenic activities and natural factors to water quality variations (Steele and Jennings 1972; Mueller et al. 1997; Antonopoulos et al. 2001; Simeonov et al. 2003; Plummer and Long 2007). Some studies have already emphasized the influence of watershed characteristics on water quality using data measured at different scales (Chang 2008; Boeder and Chang 2008).

In this study, we conducted a spatio-temporal statistical analysis of water quality variation in the upper and middle reaches of the HRB. The Huai River is the sixth largest river in China, and its aquatic environment has been seriously polluted. A multi-scale method was employed to diagnose water quality trends at different scales, i.e., the subbasin and buffer scales. The water quality, land use, and other anthropogenic activity variables (such as point source emission, water temperature, and regulated flow) derived at these scales were assumed to be independent in our analysis. The major objectives of this study were to (1) identify water quality trends at different scales using a temporal analysis method (the seasonal Mann–Kendall test) and a spatial analysis method (Moran’s I), (2) explore the spatial and temporal variations of water quality as affected by anthropogenic activities and natural factors, and (3) compare the variation of water quality over time at various scales of influence (500 and 1000 m buffer zones versus the whole basin). This research helps to characterize water quality variation at different scales and provides a scientific basis for efficient water quality management.

Materials and methods

Study area

The Huai River (30° 55′∼36° 36′ N, 111° 55′∼120° 45′ E), located in eastern China between the Yangtze River Basin and the Yellow River Basin (Zuo et al. 2015), flows from the Tongbai Mountain of Henan province in the west through Hubei, Anhui, Shandong, and Jiangsu provinces in the east, and into the Yangtze River at Sanjiangying in Jiangsu (Fig. 1). The HRB encompasses 270,000 km², accounting for 3.5 % of the national land area. It is an important agricultural base, with approximately 12,192 km² of farmland and an agricultural population that accounts for 86.1 % of the total. The HRB is a water-scarce area, with an average annual rainfall of 911 mm and an average annual runoff of 45.2 billion m³ from 1956 to 2000. The population and total water resources of the HRB account for 16.2 and 3.4 % of those of China, respectively. The water use-to-availability ratio (about 60 %) exceeds the level that is considered internationally to represent rational development and utilization. The main land uses are dry farmland (62 %) and paddy (20 %), followed by forest, grass, water, and urban land. As a result of the unique climatic conditions, river morphology, and topography, floods and droughts have occurred frequently in the HRB. To help control flooding and relieve water shortages, approximately 11,000 water projects, including reservoirs and weir sluices, had been constructed by 2000, and these have largely changed the natural runoff. In addition, a large amount of untreated industrial wastewater and domestic sewage has been discharged directly into the river, resulting in the deterioration of the ecological environment. Natural water chemistry characteristics differ significantly among regions and are subject to the impact of human activities.

Fig. 1 Location of study area, monitoring sites of water quality, dams and floodgates, and sewage outlets in the HRB

Data

Monthly water flow, water chemistry, and sewage discharge data for 17 monitoring sites cover the period from 2000 to 2014. The sites are distributed along two tributaries, the Shaying River and the Guo River (seven and four stations, respectively), and the main stream of the Huai River (six stations). All of the data were provided by the Huai River Water Resources Commission. Four commonly used water quality indices, DO, NH3-N, CODMn, and TP, determined according to the national uniform standards for water quality (HJ 506–2009; GB 11892–1989; HJ 535–2009; GB 11893–1989), were considered in our study. These indices have also been employed in many previous studies of water quality variability in the HRB (Zhao et al. 2010; Zhang et al. 2013; Zhai et al. 2014; Dou et al. 2015). Detailed information on the data and sites is given in Table 1. In addition, monthly water discharge data for 2000–2012 were collected from the Huai River Hydrographic Bureau. We also calculated the mean annual water quality for two subperiods, i.e., the 2000s (2005–2009) and the 2010s (2010–2014), to reduce the potential effect of interannual and hydroclimatic variability on the spatial regression analysis. Because the distributions of the original data (DO, NH3-N, CODMn, TP) were positively skewed, the data were log-transformed so that outliers could be checked and, where necessary, removed from the statistical analysis.

Table 1 Statistical parameters of monthly values from 17 sampled monitoring stations

The digital elevation model (DEM) data were downloaded from the SRTM 3s Digital Elevation Database provided by USGS/NASA (Ma et al. 2014). The land cover data sets for 2004 and 2009 were collected from the Global Change Parameters Database of the Chinese Academy of Sciences (available on http://globalchange.nsdc.cn). Land cover is divided into six major types: farmland, forest, grassland, waters, urban blocks, and unused land (lands with exposed soil, sand, rocks, or snow and never has more than vegetated cover during any time of the year). Surface elevation and mean slope were derived from the DEM. In order to explore the influence of anthropogenic and natural factors on water quality variability, land cover changes between 2004 and 2009 were analyzed at various spatial scales, i.e., the subbasin, 500 m buffer, and 1000 m buffer scales.

Methodology

GIS analysis

Land use change, topographic characteristics, and monitoring station locations were all analyzed with ArcGIS 10.2 Desktop GIS software (ESRI 2013). All datasets were converted to a common coordinate system (Gauss projection coordinates). Two different spatial scales (i.e., subbasin scale and buffer scale) were considered to explore the relationship between water quality variability and land use and topography. Figure 2 shows the percentages of the different land cover types at these spatial scales in 2004 and 2009 in the HRB.

Fig. 2 Land use compositions of each land cover type at the a subbasin, b 500 m buffer, and c 1000 m buffer scale for 17 monitoring stations in 2004 and 2009 (shown in the left and right columns, respectively)

At the subbasin scale, the subbasin boundaries were delineated from the DEM data in ArcGIS, and each monitoring station was taken as the outlet of its corresponding subbasin. At the buffer scale, each water quality monitoring station was set as the center of a geographical buffer. Two buffer widths, 500 and 1000 m, were employed to define the boundaries of the hydrologic units. These two buffer widths have also been used in other studies (Sliva and Williams 2001; Chang and Carlson 2005; Li et al. 2009, 2013; Zhao et al. 2015; Amuchástegui et al. 2016).
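The buffer-scale idea can be illustrated with a minimal Python sketch. It is not the ArcGIS workflow used in the study: it assumes land cover is available as equal-area grid cells with projected coordinates in metres, and the function name `land_cover_shares` and the sample cells are hypothetical.

```python
import math
from collections import Counter


def land_cover_shares(station, cells, radius_m):
    """Percentage of each land cover class within radius_m of a station.

    station: (x, y) in metres in a projected coordinate system.
    cells:   iterable of (x, y, land_cover_class) for equal-area grid cells.
    """
    inside = Counter(
        cover
        for x, y, cover in cells
        if math.dist(station, (x, y)) <= radius_m
    )
    total = sum(inside.values())  # zero if no cell falls inside the buffer
    return {cover: 100.0 * count / total for cover, count in inside.items()}


# Hypothetical cells around a station placed at the origin
cells = [
    (100.0, 0.0, "farmland"),
    (400.0, 0.0, "farmland"),
    (0.0, 300.0, "urban"),
    (800.0, 0.0, "forest"),
    (0.0, 900.0, "farmland"),
    (1500.0, 0.0, "waters"),
]
shares_500 = land_cover_shares((0.0, 0.0), cells, 500.0)
shares_1000 = land_cover_shares((0.0, 0.0), cells, 1000.0)
```

Comparing `shares_500` with `shares_1000` shows how the land cover composition seen by a station can change with buffer width. The sketch raises a division error if no cell lies within the buffer.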

Statistical analysis

The seasonal Mann–Kendall test, a robust non-parametric method that has proved suitable for trend analysis of variables with seasonality, was used to diagnose the temporal trends of the water quality parameters in our study (Lettenmaier et al. 1991; Helsel and Hirsch 1992; Chang 2008; Boeder and Chang 2008; Buendia et al. 2016). In addition, the median slope over the user-defined seasons was used to estimate the magnitude of the variation.

According to the seasonal Mann–Kendall test, the null hypothesis H 0 of randomness states that the data (x 1,  ... , x n ) are a sample of n independent and identically distributed random variables (Chen et al. 2006); a process that is seasonal but otherwise trend-free does not violate this hypothesis. Let \( X={\left({X}_1,{X}_2,\dots, {X}_m\right)}^T \) and \( {X}_i=\left({x}_{i1},{x}_{i2},\dots, {x}_{i{n}_i}\right) \), where X is the observed water quality sample, X i is the subsample series for month i, m is the number of months (m = 12), and X i contains the n i annual values from month i. The statistic S i for each month is defined as follows:

$$ {S}_i=\sum_{k=1}^{n_i-1}\sum_{j=k+1}^{n_i}\mathrm{sgn}\left({x}_{ij}-{x}_{ik}\right),\quad 1\le k<j\le {n}_i $$
(1)

The SMK test statistic is \( S=\sum_{i=1}^m{S}_i \); under the null hypothesis, E(S) = 0 and \( Var(S)=\sum_g{\sigma_g}^2+\sum_{g\ne h}{\sigma}_{gh} \), where g and h index months. The variance for month g and the covariance between months g and h can be calculated from the observed series as follows:

$$ {\sigma_g}^2=\frac{n_g\left({n}_g-1\right)\left(2{n}_g+5\right)}{18} $$
(2)
$$ {\sigma}_{gh}=\frac{H_{gh}+4\sum_{j=1}^n{R}_{gj}{R}_{hj}-n\left({n}_g+1\right)\left({n}_h+1\right)}{3} $$
(3)
$$ {H}_{gh}=\sum_{j<l}\mathrm{sgn}\left[\left({x}_{gl}-{x}_{gj}\right)\left({x}_{hl}-{x}_{hj}\right)\right] $$
(4)

where n is the number of years of record and R gj is the rank of x gj among the observations of month g.

The standard normal deviate Z (standardized statistic) follows a standard normal distribution and is expressed as follows:

$$ Z=\left\{\begin{array}{r}\frac{S-1}{\sqrt{Var(S)}},S>0\\ {}0,S=0\\ {}\frac{S+1}{\sqrt{Var(S)}},S<0\end{array}\right. $$
(5)

In this study, we select significance levels α of 0.05 and 0.10 in a two-sided test for trends in the water quality parameters. The null hypothesis is accepted if |Z| < Z α/2 = 1.96 and 1.65, respectively, where FN (Zα/2) = α/2, FN being the standard normal cumulative distribution function and α being the significance level of the test (Hirsch et al. 1982). A positive value of S represents an upward trend, and a negative value represents a downward trend. Furthermore, the median slope indicates the magnitude of the water quality trend (Helsel and Hirsch 1992). The individual slopes are calculated as d ijk  = (x ij  − x ik )/(j − k) for 1 ≤ k < j ≤ n i , where x ij is the value of the variable in month i of year j and n i is the number of annual values for month i; the seasonal Kendall slope is the median of all d ijk .
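The test statistic and slope estimator described above can be sketched in a few lines of Python. This is a simplified illustration rather than the procedure used in the study: it assumes no ties and no missing values, and it ignores the covariance between months, so Var(S) reduces to the sum of the monthly variances in Eq. (2). The function name and the sample data are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def seasonal_mann_kendall(series_by_month):
    """Return (S, Z, median slope) for a dict {month: [annual values]}.

    Simplifying assumption: covariance between months is ignored, so
    Var(S) is the sum of the per-month variances of Eq. (2).
    """
    s_total, var_total, slopes = 0, 0.0, []
    for values in series_by_month.values():
        n_i = len(values)
        # Eq. (1): sum of signs over all pairs of years k < j in one month
        for k, j in combinations(range(n_i), 2):
            s_total += sign(values[j] - values[k])
            slopes.append((values[j] - values[k]) / (j - k))
        # Eq. (2): variance of the monthly statistic, no tie correction
        var_total += n_i * (n_i - 1) * (2 * n_i + 5) / 18
    # Eq. (5): continuity-corrected standard normal deviate
    if s_total > 0:
        z = (s_total - 1) / math.sqrt(var_total)
    elif s_total < 0:
        z = (s_total + 1) / math.sqrt(var_total)
    else:
        z = 0.0
    return s_total, z, median(slopes)


# Hypothetical record: two months, five years each, both rising steadily
s, z, slope = seasonal_mann_kendall({
    "Jan": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Jul": [2.0, 4.0, 6.0, 8.0, 10.0],
})
significant_at_005 = abs(z) > 1.96
```

Each month needs at least two annual values for the sketch to work. Including the covariance terms of Eqs. (3) and (4) would generally change Var(S) and therefore Z when the monthly series are correlated.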

Spatial autocorrelation analysis is an important field of spatial statistics research, and it is also one core method for studying the distribution association among geographic units. Global indices of spatial autocorrelation have been widely used to evaluate the degree to which similar observations tend to occur near each other. As a widely used spatial autocorrelation index, Moran’s I (Moran 1950) reflects the spatial dependence degree of different variables and can be expressed as follows:

$$ I=\frac{n}{\sum_{i=1}^n\sum_{j=1}^n{W}_{ij}}\cdot \frac{\sum_{i=1}^n\sum_{j=1}^n{W}_{ij}\left({X}_i-\overline{X}\right)\left({X}_j-\overline{X}\right)}{\sum_{i=1}^n{\left({X}_i-\overline{X}\right)}^2} $$
(6)

Although useful for determining overall patterns of a particular dataset, the Global Moran’s I statistic falls short when examining the relationships between sites of that dataset, and this shortcoming was addressed with the development of a Local Moran’s I analysis that yields further information about where the patterns of autocorrelation exist within the occurrences of interest (Anselin 1995). An important distinction between the analyses is that Local Moran’s I is disaggregated and therefore examines the degree to which neighboring data points are similar or dissimilar. The spatial autocorrelation differences of different objects can be represented by the local Moran’s I index and Moran scatter plot. The local Moran’s I is expressed as follows:

$$ {I}_i=\frac{n}{\sum_{j=1}^n{W}_{ij}}\cdot \frac{\sum_{j=1}^n{W}_{ij}\left({X}_i-\overline{X}\right)\left({X}_j-\overline{X}\right)}{\sum_{j=1}^n{\left({X}_j-\overline{X}\right)}^2} $$
(7)

where X i and X j represent the water quality parameters monitored at station i and station j, respectively, \( \overline{X} \) is the average value of water quality, and W ij is the weight matrix, calculated from the inverse distance between stations i and j. Moran’s I ranges from −1 to 1. Positive values of I are associated with strong geographic patterns of spatial clustering, negative values of I are associated with a regular pattern, and a value close to zero represents complete spatial randomness (O’Sullivan and Unwin 2003). The standard normal variate Z of the spatial autocorrelation index I is computed by

$$ Z=\frac{I-E(I)}{\sqrt{Var(I)}} $$
(8)

Where E(I) and Var(I) are the mean and variance of spatial autocorrelation indices.
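As a concrete illustration of Eq. (6), the following Python sketch computes the Global Moran’s I with inverse-distance weights. It is a minimal example under stated assumptions, not the ArcGIS implementation used in the study: station coordinates are treated as planar, no two stations coincide, and the function names and sample values are hypothetical. The Z score of Eq. (8) is not computed because it requires an expression for Var(I).

```python
import math


def inverse_distance_weights(coords):
    """Build W with w_ij = 1 / d_ij for i != j and w_ii = 0."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = 1.0 / math.dist(coords[i], coords[j])
    return w


def global_morans_i(values, w):
    """Global Moran's I as written in Eq. (6)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in w)
    cross = sum(
        w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / w_sum) * cross / sum(d * d for d in dev)


# Hypothetical stations spaced evenly along a river reach
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
w = inverse_distance_weights(coords)
i_clustered = global_morans_i([1.0, 1.0, 5.0, 5.0], w)
i_alternating = global_morans_i([1.0, 5.0, 1.0, 5.0], w)
```

In this toy case the clustered arrangement yields a larger I than the alternating one, which is the behaviour the index is meant to capture. With very few sites, I should be compared with its expectation under randomness, E(I) = −1/(n − 1), rather than with zero.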

The Z-score results reported from Moran’s I analysis are in essence the slope of a regression line derived from the scatter plot of the differences between data points, and the cluster/outlier type (CO-type) reported is determined by the quadrant in which a data point falls (i.e., “High-High (H-H)”, “Low-Low (L-L)”, “High-Low (H-L)”, or “Low-High (L-H)”) (Anselin 1996). A significance level α is selected for the test. If Z > Zα/2, the monitoring station is a significant high or low concentration cluster center for the water quality parameter, namely an “H-H” or “L-L” cluster pattern. If Z < −Zα/2, the monitoring station is an outlier (“H-L” or “L-H”). Otherwise, the regional spatial autocorrelation of the variable is not significant and the distribution is random.
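The quadrant labels can be made concrete with a short Python sketch that compares a site’s deviation from the mean with the weighted mean deviation of the other sites (its spatial lag). This only assigns the label; it does not test significance, which has to be assessed separately as described above. The function name, the binary neighbour weights, and the sample values are hypothetical.

```python
def moran_quadrant(values, w, i):
    """Label site i by its Moran scatter plot quadrant, e.g. "H-L".

    The first letter describes the site itself and the second its
    neighbours, both relative to the overall mean. Zero deviations are
    labelled "L", an arbitrary convention for this sketch.
    """
    n = len(values)
    mean = sum(values) / n
    own = values[i] - mean
    row_sum = sum(w[i])
    # Spatial lag: weighted mean deviation of the other sites
    lag = sum(w[i][j] * (values[j] - mean) for j in range(n)) / row_sum
    site = "H" if own > 0 else "L"
    neighbours = "H" if lag > 0 else "L"
    return f"{site}-{neighbours}"


# Hypothetical chain of four stations; each is a neighbour of the next
w = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
cluster_low = moran_quadrant([1.0, 1.0, 5.0, 5.0], w, 0)
cluster_high = moran_quadrant([1.0, 1.0, 5.0, 5.0], w, 3)
outlier = moran_quadrant([1.0, 5.0, 1.0, 1.0], w, 1)
```

A station labelled "H-L" or "L-H" is a candidate outlier, and one labelled "H-H" or "L-L" is a candidate cluster center; only sites that also pass the significance test are reported as such.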

Land use and water quality data at the subbasin and buffer scales are used to identify the relationship between them by regression analysis. Multiple linear regression, which is widely used in similar studies (Joarder et al. 2008; Yan et al. 2013; Yu et al. 2013), is used to identify the impacts of anthropogenic interventions and natural factors on the variation of water quality. The independent variables are the percentages of land cover at the different scales, elevation and mean slope, water quality load, regulated flows, and water temperature. Both the predictors (the factors influencing water quality) and the response variables are log-transformed before the regression analysis. The spatial error model takes the form Y i  = X i β i  + ε, ε = λW ε  + ξ, where Y i and X i are the dependent and independent variables at location i, β i is the regression coefficient, ε is the random error term, λ is the autoregressive coefficient of the spatial error model, W ε is the spatially lagged error term, and ξ is the homoskedastic and independent error term. Three extensively used indicators of the degree of multicollinearity (tolerance: Tol, variance inflation factor: VIF, condition index: CI) are employed in the regression diagnostic analysis (Zhai et al. 2014). More detailed descriptions of the goodness of fit and statistical significance tests of the regression function can be found in the literature (Wherry 1931; Velleman and Roy 1981; Asterious and Hall 2011).
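Two of the multicollinearity indicators named above can be computed directly from their standard definitions: the tolerance of a predictor is 1 − R² from regressing it on the remaining predictors, and VIF is the reciprocal of the tolerance. The Python sketch below does this with a small ordinary least squares solver from the standard library. It is an illustration only, the condition index is not covered, and the predictor names and values are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def r_squared(y, predictors):
    """R^2 of an OLS fit of y on the predictor columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * y[i] for i, r in enumerate(rows)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def tolerance_and_vif(columns):
    """Return {name: (Tol, VIF)} for each predictor column."""
    out = {}
    for name, col in columns.items():
        others = [c for other, c in columns.items() if other != name]
        tol = 1.0 - r_squared(col, others)
        out[name] = (tol, 1.0 / tol)
    return out


# Hypothetical predictors: the first pair is uncorrelated, the second is not
independent = tolerance_and_vif({
    "urban": [1.0, 2.0, 3.0, 4.0],
    "slope": [1.0, -1.0, -1.0, 1.0],
})
collinear = tolerance_and_vif({
    "urban": [1.0, 2.0, 3.0, 4.0],
    "farmland": [1.0, 2.0, 3.0, 5.0],
})
```

Uncorrelated predictors give Tol and VIF close to 1, while nearly collinear predictors give a small Tol and a large VIF. Exactly collinear predictors make the tolerance zero, and the sketch then fails with a division error.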

Results

Spatio-temporal variation trends of water quality

Temporal trends

Figure 3 shows the temporal trends of the four water quality elements given by the seasonal Mann–Kendall test at the 17 stations from 2005 to 2014. For DO, significant increasing trends, which correspond to improving water quality (range 0.35–2.51 %/year), are observed at Mengcheng station in the Wo River, at Bengbu and Wujiadu in the Huai Mainstream, and at Huaidian, Zhoukou, Jialuhe, and Shahe in the Shaying River, whereas a significant decreasing trend is observed at Jieshou station in the Shaying River (range 0.23–1.23 %/year). For NH3-N, all stations show decreasing trends, most of which are significant (range 0.12–3.68 mg/L/year). The stations with insignificant decreasing trends are located in the Huai Mainstream. Similar results are found for CODMn: approximately 60 % of the stations exhibit significant decreasing trends (range 0.52–12.75 mg/L/year), and a further 30 % of the stations, in the Wo River and Huai Mainstream, show insignificant downward trends. TP shows decreasing trends at most stations, 50 % of which are significant (range 0.08–0.33 mg/L/year). It is noteworthy that the three sites in the lower reach of the Huai Mainstream show an increasing trend. Because decreases in NH3-N, CODMn, and TP mean an improvement in water quality, the trends in these three elements all indicate that water quality improved in most areas of the HRB during 2005–2014.

Fig. 3 Trends in water quality for a DO, b NH3-N, c CODMn, d TP, 2005–2014

Spatial variation of water quality for the 2000s and 2010s

Figure 4 and Fig. 5 show the spatial variation of the four water quality elements during the two subperiods (2000s and 2010s). As shown in Fig. 4a and Fig. 5a, DO conditions are better in the Huai Mainstream and the upper reaches of the major tributaries, where average DO concentrations are higher. Comparing the two subperiods, DO concentrations increase by about 14.1 % from the 2000s to the 2010s for all stations as a whole. In detail, about 88 % of the stations show an increase in DO concentration between the two subperiods, while two stations show a decrease (14.3 % at Fuyang and 4.6 % at Woyang). Among the stations with an increase, Haozhoutielu, Luhufuqiao, and Bengbu change the most, with increases of 33.4, 25.1, and 21.9 %, respectively. Figure 4b and Fig. 5b show that NH3-N concentrations are higher in the major tributaries than in the Huai Mainstream. The reduction of NH3-N concentration at all stations indicates an improvement in water quality. The average decrease is 53.0 % from the 2000s to the 2010s for all stations as a whole, and the average reduction rates in the Huai Mainstream, Shaying River, and Wo River are 31.5, 63.3, and 67.1 %, respectively. The most obvious improvement is observed at Luhufuqiao station, with a change of −85.2 % between the two subperiods. CODMn shows a spatial pattern similar to that of NH3-N, i.e., high values in the major tributaries and low values in the Huai Mainstream of the HRB. For the stations showing decreasing CODMn between the two subperiods (76.5 % of all stations), the average reduction rate is 17.9 %, while the other four stations show increases of 26, 4.8, 12.5, and 16.1 % at Lutaizi, Fuyang, Yingshang, and Mengcheng, respectively. The average reduction rates of CODMn concentration in the Huai Mainstream, Shaying River, and Wo River are 7.8, 16.5, and 26.0 %, respectively. For TP, water quality improves at 47.1 % of the stations, and the most obvious change is at Xiaoliuxiang, with a reduction rate of 55.9 %, followed by Jialuhe (55.9 %) and Haozhoutielu (44.6 %) from the 2000s to the 2010s. It should be noted that the percentage of stations showing an increase in TP concentration (52.9 %) is a little higher than that showing a decrease. For example, TP concentration increases by 94.6 and 54.1 % at Mengcheng and Fuyang, respectively, which means that water quality at these stations has deteriorated.

Fig. 4 Spatial trends of water quality for a DO, b NH3-N, c CODMn and d TP during the 2000s in the HRB

Fig. 5 Spatial trends of water quality for a DO, b NH3-N, c CODMn and d TP during the 2010s in the HRB

Spatial autocorrelation of water quality trends

Global spatial autocorrelation analysis

Table 2 shows weak to moderate positive spatial autocorrelation in NH3-N and CODMn, with Moran’s I values varying from 0.23 to 0.31 and from 0.23 to 0.34, respectively. The Moran’s I values of DO also indicate a weak positive spatial autocorrelation for the whole period and the 2010s, while there is a weak negative spatial autocorrelation in the 2000s. Similarly, TP varies randomly through space, with different types of spatial autocorrelation observed in different periods.

Table 2 Moran’s I values and standardized statistic Z scores of water quality trends from 2005 to 2014 and in the 2000s and the 2010s for 17 sites in HRB

In addition, the increase in the Moran’s I values of CODMn, DO, and TP between the 2000s and the 2010s may be caused by the moderation of external disturbances, while the decline for NH3-N indicates that localized water quality problems deserve further careful attention. The lower Moran’s I values of DO and TP indicate that spatial heterogeneity in watershed characteristics might contribute to the variations of DO and TP concentrations. The increases may be caused by regional anthropogenic interventions or natural factors, such as the large number of dams and floodgates constructed in the HRB, pollution emissions into the rivers, and extreme events.

Local spatial autocorrelation analysis

Once the Global Moran’s I analysis is completed and reveals a high degree of clustering within the dataset, the Local Moran’s I analysis is performed to further examine the nature of individual relationships between the data points. Table 3 shows the p-values from the Local Moran’s I analysis and reveals clusters that are significant at a confidence level greater than 95 %. Two cluster centers with significantly high concentrations are detected for DO and TP in the 2010s at Mengcheng and Huaidian, respectively, while four cluster centers with significantly low concentrations are detected for DO at Wangjiaba and Huaidian in the 2010s and for CODMn at Shahe in the whole period and the 2010s. In addition, one LH and two HL outliers are detected for TP, at Wangjiaba in the 2010s and at Shahe for the whole period and the 2000s, respectively. The results in Table 3 indicate that no significant autocorrelation exists in the NH3-N series for any of the three periods. The significant HL outlier for TP at Shahe in the 2000s changes into a significant LH outlier and moves downstream to Wangjiaba station in the 2010s.

Table 3 The significance level of local spatial autocorrelation analysis

Relation of water quality and anthropogenic activities and topography

Pollution emissions

To quantify the magnitude and significance of the relationship between anthropogenic intervention factors and the three water quality parameters NH3-N, CODMn, and TP, multiple regression analysis, an effective and widely used method, is employed in this study. Table 4 gives the results of the multiple linear regression models between the concentration of each water quality element and the flow (Q), water quality load, and water temperature (Tw) monitored at four stations (Lutaizi, Wangjiaba, Fuyang, and Mengcheng) of the HRB from 2000 to 2012. Table 4 shows that multicollinearity among the Tw, Q, and water quality load series is not serious, with Tolmin (the minimum tolerance) > 0.14, VIFmax (the maximum variance inflation factor) < 9, and CImax (the maximum condition index) < 30. The results of the Durbin–Watson and Breusch–Godfrey tests indicate that the residuals of the regression models are significantly uncorrelated (P < 0.05) at Lutaizi, Wangjiaba, and Fuyang, except for the TP series at Lutaizi station, for which the result is not significant. Therefore, the estimated coefficients of the regression models are considered unbiased and effective.

Table 4 Water quality (NH3-N, CODMn and TP) estimated from multiple linear regression models

As shown in the columns R and Radj 2, all the correlation coefficients and adjusted determination coefficients are greater than 0.70 and 0.32, respectively, except for the TP series at Lutaizi station, which suggests a good fit between the regression models and the observed data. Furthermore, Tw, Q, and water quality load are significantly correlated with water quality variation. For NH3-N, the significant predictors are Tw (P < 0.05) at Lutaizi; NH3-N load (P < 0.05) and Tw (P < 0.10) at Wangjiaba; NH3-N load and Tw (P < 0.05) at Fuyang; and Q (P < 0.05) and NH3-N load (P < 0.10) at Mengcheng. For CODMn, the significant predictors are CODMn load (P < 0.05) at Lutaizi; Tw (P < 0.05) at Wangjiaba; CODMn load and Q (P < 0.10) at Fuyang; and CODMn load, Q, and Tw (P < 0.05) at Mengcheng. Similarly, for TP, the significant predictors are TP load (P < 0.05) at Wangjiaba; TP load and Q (P < 0.05) at Fuyang; and TP load and Q (P < 0.05) and Tw (P < 0.10) at Mengcheng.

Land cover and topography

To investigate the influence of land cover and topography on water quality variations, the spatial regression models (Table 5) in the 2000s and 2010s at various scales (subbasin, 1000 m buffer and 500 m buffer) are established to identify the significant explanatory variables for NH3-N, CODMn and TP, respectively. The regression diagnostic shows that multicollinearity is not serious among land cover and topography variables (Tolmin > 0.1, VIFmax < 20, CImax < 30). Besides, the residuals from the regression models are considered to be uncorrelated at the significance level of 0.05 or 0.10. Thus, the estimated coefficients in the established regression functions are perceived as unbiased and effective.

Table 5 Water quality (NH3-N, CODMn, and TP) estimated from spatial regression models

The 18 spatial regression models in Table 5 indicate that the topography variables (elevation and slope) are positively correlated with NH3-N at the 0.05 significance level and negatively correlated with CODMn at the 0.1 significance level. The predictive ability of the regression models for NH3-N and CODMn generally declines from the 2000s to the 2010s, as shown by the lower R 2 values in the 2010s. A similar decline is found as the scale decreases. This suggests that other factors not included in the regression models have become important in explaining the variation of water quality. At the subbasin scale, urban land cover is the primary predictor for NH3-N and CODMn. As for NH3-N, the second significant factor is farmland at the subbasin scale and 500 m buffer scale while is the farmland land cover at the 1000 m buffer scale. The positive sign of the coefficients between farmland and NH3-N suggests that agricultural development is the driving source of nutrient concentrations. For CODMn, the topography variables, farmland, and forest land cover are the main explanatory variables, and farmland and forest exhibit negative coefficients. In the 2000s, the variation of NH3-N and CODMn at the subbasin scale can be partially explained by urban land cover, elevation, and forest. At the 1000 m buffer scale, farmland, forest, and waters explain the variation in NH3-N and CODMn in the 2000s. At the 500 m buffer scale, elevation and slope are the significant factors explaining the variation of NH3-N and CODMn. For TP, slope and forest exhibit negative coefficients in both periods at the different spatial scales. At the 500 m buffer scale, farmland is the primary predictor for TP.

Discussions

Spatio-temporal statistical analysis

Owing to the rapid regional socio-economic development in the HRB, anthropogenic activities, including water consumption and pollution emission control, play an important role in water quality improvement and result in large spatial variability in the water quality parameters.

Trend analysis results for DO, NH3-N, CODMn, and TP indicate a significant improvement of water quality conditions from 2005 to 2014. The stations showing improvement account for 88.2 % (15/17), 100 % (17/17), 82.4 % (14/17), and 47.1 % (8/17) of all sample sites for the four water quality elements, respectively. The Huai Mainstream and the upper reaches of the major tributaries had better DO conditions, while the major tributaries had higher NH3-N and CODMn concentrations than the Huai Mainstream. In addition, the water quality of the Huai Mainstream deteriorated if only the TP indicator was considered. The many pollution control measures taken by local government since 2000, including industrial pollution control, sewage treatment plants, and agricultural non-point source pollution control, contributed to the significant improvement of water quality in the HRB. With the implementation of the 10th and 11th Five-Year Plans in China, the total amount of pollution emission declined to 4.7 billion t, within which the total discharges of ammonia nitrogen and chemical oxygen demand were reduced to 1.042 million t and 140,000 t, respectively. On the other hand, dam operation might also be a potential cause of the improvement in water quality conditions, since the HRB is a highly regulated river basin with a large number of dams and floodgates.
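
The station-level trends summarized above come from a seasonal trend test. As an illustration only, the sketch below implements a simplified seasonal Mann–Kendall statistic. It assumes independence between seasons and applies no tie correction to the variance, so it is not necessarily the exact variant used in this study. The function name and the concentration values are hypothetical.

```python
import math


def seasonal_mann_kendall(series_by_season):
    """Simplified seasonal Mann-Kendall test.

    series_by_season: list of per-season lists, each in chronological order.
    Returns (S, Z, two_sided_p). Ties contribute zero to S, and the variance
    uses the no-ties formula n(n-1)(2n+5)/18 summed over seasons.
    """
    s_total = 0
    var_total = 0.0
    for values in series_by_season:
        n = len(values)
        # Sum of signs of all later-minus-earlier differences in this season.
        for i in range(n - 1):
            for j in range(i + 1, n):
                diff = values[j] - values[i]
                s_total += (diff > 0) - (diff < 0)
        var_total += n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standard normal score.
    if s_total > 0:
        z = (s_total - 1) / math.sqrt(var_total)
    elif s_total < 0:
        z = (s_total + 1) / math.sqrt(var_total)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return s_total, z, p


# Hypothetical NH3-N concentrations (mg/L): one list per season, six years each.
nh3n_by_season = [
    [2.1, 1.9, 1.8, 1.5, 1.2, 1.0],
    [2.6, 2.7, 2.2, 2.0, 1.6, 1.4],
]
s, z, p = seasonal_mann_kendall(nh3n_by_season)
print(f"S={s}, Z={z:.2f}, p={p:.4f}")
```

A negative Z with a small p-value would indicate a significant decreasing trend at a station; counting such stations gives shares like those reported above.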

Influence of pollution emissions

Industrial pollution and non-point source emissions are two important factors that raise pollutant concentrations. Industrial pollution accounts for about 60 % of urban sewage and waste emissions, while non-point source emissions result mainly from the excessive use of pesticides and fertilizers, which flow into the rivers and significantly increase the total amounts of NH3-N, CODMn, and TP. Water quality is also affected by hydrological variables (water flow), which contribute to the migration and transformation of pollutants in the river, and by water environmental variables (water temperature), which indirectly reflect the degree of pollutant degradation in the water body. Similar analyses can be found in previous studies (Buck et al. 2004; Chang and Carlson 2005; Dou et al. 2013; Zhang et al. 2013; Zhao et al. 2015).

Influence of land cover and topography

The regression analysis between water quality and land cover in this study suggests that land cover types influence the water quality parameters (NH3-N and CODMn) to different degrees. We found that urban land cover was significantly and positively correlated with the NH3-N and CODMn variations at the subbasin scale, which is consistent with most previous studies (Ahearna et al. 2005; Haidary et al. 2013; Wan et al. 2014). We can also infer that the accelerating urbanization in China and the large changes in the underlying surface since 2000 might be among the causes of the changes in water quality conditions. At the subbasin scale, farmland and urban blocks were the main sources of non-point source pollution in the HRB compared with the other land use types, in agreement with findings that NH3-N and CODMn come primarily from agricultural and urban land uses (Jones et al. 2001; Sonoda et al. 2001). Besides farmland and urban blocks, industrial and domestic factors can also affect water quality conditions (Zhao et al. 2015). However, the relationships between these factors and the water quality elements (NH3-N and CODMn) were significant only at the subbasin and 1000 m buffer scales. In our study, we also observed positive coefficients between industrial land use and water quality, which is opposite to the result given by Sonoda et al. (2001).

This study found negative coefficients between slope and both NH3-N and CODMn in both periods at the buffer scale. That is, pollutant concentrations decreased as slope variability increased. In general, faster-flowing river water contains more dissolved oxygen (Chang 2008), which helps to lower pollutant concentrations. However, Pratt and Chang (2012) suggested that a gentle slope could slow water movement, which contributes to the mixing of pollutants and provides a longer time to oxidize and decompose them. Richards et al. (1996) and Sliva and Williams (2001) likewise reported that increased slope variability was positively correlated with increases in pollutant concentrations.

Variations linking with scales

Our analysis demonstrated that buffer size plays an important role in determining which relations between land cover types and water quality indicators are significant. Our conclusions agree with previous studies, which indicated that the larger-scale drainage area has a more significant influence than the smaller-scale buffer (Sliva and Williams 2001; Nash et al. 2009). Our study area covers a large area (270,000 km2) with a high population density, and it is known that water quality variables are influenced by the scale of land cover assessment. As the results show, urban blocks and farmland land cover were the significant factors that could explain NH3-N and CODMn variations at larger scales. Diffuse source emissions from agricultural land, industries, and domestic sewage that reach the rivers from a distance can increase pollutant concentrations.
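
To make the role of buffer size concrete, the sketch below computes land cover shares around a station for two buffer radii. It is an illustration under stated assumptions, not the GIS workflow of this study: it assumes circular buffers centred on the station, equal-area land cover cells, and projected coordinates in metres. The function name, coordinates, and class labels are hypothetical.

```python
import math
from collections import Counter


def land_cover_shares(station, cells, radius_m):
    """Share of each land-cover class among cells within radius_m of a station.

    station: (x, y) in metres; cells: iterable of (x, y, cover_class).
    Assumes equal-area cells, so counts are proportional to area.
    Returns an empty dict if no cell falls inside the buffer.
    """
    sx, sy = station
    counts = Counter(cover for x, y, cover in cells
                     if math.hypot(x - sx, y - sy) <= radius_m)
    total = sum(counts.values())
    return {cover: n / total for cover, n in counts.items()} if total else {}


# Hypothetical cell centres (metres) around a station at the origin.
cells = [
    (0, 0, "urban"), (300, 0, "farmland"), (0, 400, "farmland"),
    (800, 0, "forest"), (0, -900, "urban"), (2000, 0, "waters"),
]
for radius in (500, 1000):
    print(radius, land_cover_shares((0, 0), cells, radius))
```

Because the set of cells inside the buffer changes with the radius, the same station can have different predictor values at the 500 m and 1000 m scales, which is one reason why the significant land cover variables differ between scales.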

Therefore, plans including diffuse source emission control, industrial restructuring, and pollution control projects should be adopted by local government and should focus on the large scale. In addition, forest, grassland, and waters are beneficial for improving water quality conditions at larger scales. However, forest, grassland, and waters currently make up only about 10, 1, and 1 % of the HRB, respectively, and land cover conditions still need to be improved to reach the world average level.

Conclusions

This study aims to detect the spatio-temporal variation of water quality, identify the important factors influencing water quality variations, and reveal the mechanisms by which land use influences water quality variations, along with the scale-process interactions at multiple scales in the HRB. It provides a basis for water pollution control, water environment protection, and ecological restoration in the HRB. The conclusions are as follows:

(1) This study showed decreasing trends for NH3-N and CODMn, and diverging trends for DO and TP. NH3-N and CODMn concentrations exhibited decreasing trends at all stations, of which 76.5 % and 60 % were significant, respectively. TP showed significant decreasing trends at half of the stations and increasing trends at three other stations located on the Huai Mainstream. In addition, DO concentrations exhibited significant increasing trends at 50 % of the stations, and only the Jieshou station showed a significant decreasing trend. Overall, the analysis demonstrated a significant improvement of water quality during 2005–2014 in the HRB.

(2) There was weak to moderate positive spatial autocorrelation for NH3-N and CODMn, while DO exhibited weak positive spatial autocorrelation for the whole period and in the 2010s. Two cluster centers of significantly high concentrations were detected for DO and TP at Mengcheng and Huaidian, respectively, while four cluster centers of significantly low concentrations were detected for DO at Wangjiaba and Huaidian in the 2010s. The control measures for point source emissions could explain the appearance of these spatial patterns. The local management efforts to improve water quality conditions included a large number of regulated dams and floodgates, economic structure adjustment, and sewage treatment plants.

(3) Multiple regression models were used to explain the relationship between water quality parameters and environmental variables. Water temperature, regulated flow, and the load of water quality variables exhibited significant correlations with water quality variation. The factors determining water quality variations differed among stations and among water quality parameters. Urban land cover was the primary predictor for NH3-N and CODMn at larger scales. The predictive ability of the regression models for NH3-N and CODMn declined as the scale decreased and from the 2000s to the 2010s. The topography variables of elevation and slope exhibited positive and negative correlations with NH3-N and CODMn, respectively.

(4) Water quality variations can be affected by many factors, both anthropogenic and natural, such as point source pollution emissions, hydrological variables altered by intensive dams and floodgates, land use, topography, and extreme events. Future water quality management policy made by local government should be based on a deep understanding of how anthropogenic activities, land use, and natural factors affect water quality variations and of how scale affects these linkages over time and space.
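
The spatial autocorrelation referred to in conclusion (2) can be illustrated with a global Moran's I. The text here does not name the statistic used, so Moran's I is an assumption of this sketch, as are the function name, the binary neighbour weights, and the station values.

```python
def morans_i(values, weights):
    """Global Moran's I for station values and a spatial weight matrix.

    values: list of n observations; weights: n x n list of lists with
    weights[i][j] > 0 when stations i and j are neighbours (zero diagonal).
    Positive results indicate clustering of similar values, negative
    results indicate that neighbouring values tend to differ.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / w_sum) * (cross / sum(d * d for d in dev))


# Hypothetical example: four stations along a river, each linked to its
# immediate upstream and downstream neighbour.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1.0, 2.0, 3.0, 4.0], chain))  # smooth gradient along the river
print(morans_i([1.0, 4.0, 1.0, 4.0], chain))  # alternating high and low values
```

In practice the significance of the statistic would be assessed against its expected value under spatial randomness, which this sketch omits.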