Introduction

Chlorine is widely used to disinfect drinking water because it eliminates waterborne pathogens efficiently, costs little and is simple to apply. However, chlorine reacts with the dissolved organic matter (DOM) in source water, leading to the formation of disinfection by-products (DBPs) (Richardson et al. 2007; Guilherme and Rodriguez 2015; Kim et al. 2015). To date, hundreds of DBPs have been identified in drinking water, among which the carbonaceous DBPs [e.g., trihalomethanes (THMs), haloacetic acids (HAAs)] are strictly regulated by many agencies because of their high levels in chlorinated water and their potential risks to human health (Richardson et al. 2007; Salas et al. 2013). The concentrations of unregulated nitrogenous DBPs [e.g., haloacetonitriles (HANs)] have been reported to be much lower than those of THMs and HAAs in drinking water (Krasner et al. 2006; Chen and Westerhoff 2010; Xue et al. 2014), yet the cytotoxicity and genotoxicity of HANs may be similar to or even higher than those of THMs and HAAs (Muellner et al. 2007; Richardson et al. 2007). Accordingly, nitrogenous DBPs have received increasing attention in recent years.

DOM is regarded as the precursor of DBPs formed during chlorination. Dissolved organic carbon (DOC), dissolved organic nitrogen (DON) and UV absorbance at 254 nm (UV254) are commonly used to quantify the DOM level in source water (Nikolaoua and Lekkasa 2001; Pifer and Fairey 2014), while the specific UV absorbance at 254 nm (SUVA254), calculated as UV254 times 100 divided by DOC, is widely used to characterize the aromaticity of the DOM (Liang and Singer 2003). Generally speaking, the higher the DOM level, the higher the DBPs yields (Xie 2004). Moreover, the properties of DOM also strongly influence DBPs formation. DOM with high SUVA254 may produce higher yields of THMs and HAAs per unit carbon than DOM with low SUVA254 (Uyak and Toroz 2007; Hong et al. 2009). Conversely, DOM with high nitrogen content may contribute more to the formation of nitrogenous DBPs (Nikolaoua and Lekkasa 2001; Fang et al. 2010; Hong et al. 2015). Besides the DOM level and its properties, other water quality parameters (e.g., bromide, pH) as well as the disinfection conditions (e.g., chlorine dose, reaction time, temperature) may also have complex effects on DBPs formation (Hong et al. 2013b; Kim et al. 2015).
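
The SUVA254 calculation described above can be written as a minimal sketch, assuming UV254 is expressed in cm−1 and DOC in mg L−1 so that SUVA254 comes out in L mg−1 m−1. The function name and the sample values are illustrative and are not taken from the study.

```python
def suva254(uv254_per_cm: float, doc_mg_per_l: float) -> float:
    """Return SUVA254 in L mg^-1 m^-1, i.e. UV254 x 100 / DOC."""
    if doc_mg_per_l <= 0:
        raise ValueError("DOC must be positive")
    # The factor 100 converts absorbance per cm to absorbance per m.
    return uv254_per_cm * 100.0 / doc_mg_per_l

# Hypothetical sample: UV254 = 0.050 cm^-1 and DOC = 2.5 mg L^-1.
example = suva254(0.050, 2.5)
```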

To better understand the relationships between DBPs formation and its influencing factors (water quality and disinfection conditions), a series of regression models have been developed (Chowdhury et al. 2009; Chen and Westerhoff 2010; Abdullah and Hussona 2013). These models can help identify the key factors influencing DBPs formation and offer a practical alternative for estimating DBPs levels, which reduces the labor and cost of DBP analysis (Hong et al. 2007; Chowdhury et al. 2009). Thus, the models have a wide range of applications in the design and management of drinking water supply systems, in toxicological and epidemiological risk assessment studies and in risk-cost trade-off analyses (Chowdhury et al. 2009).

However, the available models mostly focus on THMs and HAAs, with very little attention paid to nitrogenous DBPs such as HANs (Chowdhury et al. 2009; Chu et al. 2012; Mukundan and Van Dreason 2014). According to toxicological studies, common HAN species such as dichloroacetonitrile (DCAN), bromochloroacetonitrile (BCAN) and dibromoacetonitrile (DBAN) are genotoxic and potentially carcinogenic to humans (Bull and Robinson 1985; Daniel et al. 1986). Developing models for HANs is therefore important. In addition, because source water quality and disinfection conditions differ widely, DBPs formation may vary extensively, and a site-specific regression model may over-predict or under-predict DBPs formation when applied elsewhere (Sohn et al. 2004; Chowdhury et al. 2009). For example, Yoon et al. (2003) found that the THMs levels predicted by a USA model were much higher than the levels measured in source water in Korea. Therefore, for a specific area, it is important to develop its own models to evaluate DBPs formation and identify the key factors.

Tai Lake, Qiantang River and Jinlan Reservoir are three important drinking water sources in the Yangtze River Delta, China (Hong et al. 2013b; Hong et al. 2015). Jinlan Reservoir is located in a remote area with limited human activity. The water in the reservoir had a low DOC level (1.3 mg L−1) and a low SUVA254 value (1.3 L mg−1 m−1) (Hong et al. 2013b). Qiantang River, especially its downstream reach, is strongly affected by human activity. Theoretically, allochthonous DOM, which originates from the degradation of higher plant tissues in the watershed and has high SUVA254, should be an important component of DOM in a typical river (Wetzel 2001; Hong et al. 2008). Yet owing to inputs of sanitary, industrial and agricultural effluent, the water in Qiantang River contained a high level of DOC (5.96 mg L−1) but had a low SUVA value (0.794 L mg−1 m−1) (Dong et al. 2012; Hong et al. 2015). Tai Lake is super-eutrophic, and algal blooms have occurred every year in the recent decade. The DOM level in Tai Lake was very high (DOC = 10.3 mg L−1), and the DOM also had a low SUVA value (0.52 L mg−1 m−1) (Hong et al. 2015). With the fast development of the economy and the improvement of living standards, DBPs formation in this area is increasingly becoming a public concern. However, except for a few studies on DBPs formation (Hong et al. 2013a, b; Hong et al. 2015), no THMs or HANs models have been reported for these waters. Moreover, to our knowledge, there are no available models developed exclusively for water with low SUVA values. Therefore, developing DBPs models for these waters may provide new knowledge about DBPs formation.

To address these gaps, a series of chlorination experiments were carried out using source water collected from Tai Lake, Qiantang River and Jinlan Reservoir. Multiple regression models were generated to evaluate the formation of THMs and HANs, and the key factors were identified. In addition, the regression models developed in this study (DOM with low SUVA) were compared with those from the literature (DOM with high SUVA).

Materials and methods

Field sampling and the water quality

Detailed information about the sampling sites and the water quality has been described in our previous studies (Hong et al. 2013b; Hong et al. 2015) and is presented in Table 1. Tai Lake had the highest organic level (DOC, DON and UV254), followed by Qiantang River and Jinlan Reservoir.

Table 1 Water quality of the source water

It is worth noting that the bromide level in Tai Lake (248 µg L−1) was much higher than that in Qiantang River (73 µg L−1). There are two possible reasons. (1) To alleviate the low water level in winter and the algal blooms in summer, Tai Lake draws water every year from the Yangtze River, which contains a high bromide level owing to the intrusion of salt tides from the estuary (You et al. 2012), whereas saltwater intrusion into Qiantang River has been reduced to some extent by effective measures taken by the local government. (2) A lake is a relatively steady and independent system, in which pollutants are difficult to remove once introduced, whereas a river is a flowing system, in which contaminants can be dispersed or diluted quickly.

Database description

The THMs data [trichloromethane (TCM), dichlorobromomethane (DCBM), chlorodibromomethane (CDBM) and tribromomethane (TBM)] and HANs data [trichloroacetonitrile (TCAN), DCAN, BCAN and DBAN] for Jinlan Reservoir water originated from our previous studies (Hong et al. 2013a, b). For Tai Lake and Qiantang River, the THMs and HANs data were obtained from chlorination tests following the same orthogonal design as used for Jinlan Reservoir (Hong et al. 2013b, S-Table 1). The design included the factors of chlorine dose (Cl2/DOC = 0.5, 1.5, 3.0 mg mg−1), pH (6, 7, 8), Br level [ambient (am), am + 100, am + 200 and am + 400 µg L−1], reaction time (2, 24, 72 h) and temperature (10, 20, 30 °C). Only one parameter was varied at a time, while the other parameters were kept at the “baseline” condition (i.e. Cl2/DOC = 1.5 mg mg−1, pH 7.0, time = 24 h, Br = am and temperature = 20 °C).
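
The one-parameter-at-a-time layout described above can be enumerated as in the following sketch. The dictionary keys and the helper name are illustrative and not from the study, and the bromide entries denote the amount added to the ambient level. Under this reading each water yields 12 distinct treatments, which for three waters would give 36, consistent with the n = 36 reported with the regression models, although the text does not state how the treatments were counted.

```python
# Baseline condition and factor levels as listed in the text.
# "br_added" is the bromide spiked on top of the ambient level (ug/L).
BASELINE = {"cl2_doc": 1.5, "ph": 7, "br_added": 0, "time_h": 24, "temp_c": 20}
LEVELS = {
    "cl2_doc": [0.5, 1.5, 3.0],
    "ph": [6, 7, 8],
    "br_added": [0, 100, 200, 400],
    "time_h": [2, 24, 72],
    "temp_c": [10, 20, 30],
}

def one_factor_at_a_time(baseline, levels):
    """List distinct treatments: the baseline plus each single-factor change."""
    treatments = [dict(baseline)]
    for factor, values in levels.items():
        for value in values:
            if value == baseline[factor]:
                continue  # already covered by the baseline treatment
            treatment = dict(baseline)
            treatment[factor] = value
            treatments.append(treatment)
    return treatments

treatments = one_factor_at_a_time(BASELINE, LEVELS)
print(len(treatments))
```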

Chlorination tests were carried out in a series of glass tubes, with two replicates per treatment. After being spiked with chlorine, the water samples were incubated in the dark for 2–72 h. THMs and HANs were determined according to U.S. EPA Method 551.1. Specifically, the residual chlorine in the water sample was quenched with ammonium chloride, and the sample was then adjusted to pH 5.0. After methyl tert-butyl ether (containing 1,2-dibromopropane as internal standard) and Na2SO4 were added, the sample was shaken vigorously for two minutes. Finally, the supernatant was injected into a GC-ECD system. The detection limits and the recovery rates for THMs (TCM, DCBM, CDBM and TBM) and HANs (TCAN, DCAN, BCAN and DBAN) are shown in S-Table 2.

Development of regression models

A stepwise multiple regression procedure in SPSS (Version 16.0) was used to develop the DBPs models, which covered THMs [TCM, DCBM and total THMs (T-THMs)] and HANs [DCAN, BCAN and total HANs (T-HANs)]. Other DBPs were not included because their levels in some treatments were below the detection limit (S-Tables 3–5). Before model generation, the DBPs concentrations, the water quality parameters (DOC, DON, UV254 and Br) and the chlorination factors (Cl2/DOC, time, temperature and pH) were transformed to logarithmic values. The logarithmic forms of THMs and HANs were then defined as the dependent variable (log10 Y), and the logarithmic forms of the water quality and chlorination factors were defined as the independent variables (log10 Xi). The independent variables entered the equation in order of their partial correlations with the dependent variable. In this way the most important factors were identified, and regression models of the following form were developed:

$$\log_{10} Y = a_{0} + a_{1} \log_{10} X_{1} + a_{2} \log_{10} X_{2} + \cdots + a_{i} \log_{10} X_{i}$$

Finally, the developed models were transformed to the equivalent power-function form:

$$Y = 10^{a_{0}} X_{1}^{a_{1}} X_{2}^{a_{2}} \cdots X_{i}^{a_{i}}$$
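
The log-log fitting and back-transformation described above can be illustrated with the following sketch. It is not the SPSS stepwise procedure used in the study: it fits all supplied predictors at once by ordinary least squares with NumPy and performs no variable selection. The function name and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def fit_power_law(X, y):
    """Fit y = 10**a0 * prod(X[:, j]**a[j]) by least squares in log10 space.

    X: (n_samples, n_factors) array of positive factor values.
    y: (n_samples,) array of positive DBP concentrations.
    Returns (a0, a), where a holds one exponent per factor.
    """
    log_x = np.log10(np.asarray(X, dtype=float))
    log_y = np.log10(np.asarray(y, dtype=float))
    # Prepend a column of ones so the intercept a0 is estimated too.
    design = np.column_stack([np.ones(len(log_y)), log_x])
    coef, *_ = np.linalg.lstsq(design, log_y, rcond=None)
    return coef[0], coef[1:]

# Synthetic, noise-free data generated from known (made-up) coefficients.
rng = np.random.default_rng(0)
X_demo = rng.uniform(1.0, 10.0, size=(30, 2))
y_demo = 10 ** -1.0 * X_demo[:, 0] ** 0.5 * X_demo[:, 1] ** 1.2
a0, exponents = fit_power_law(X_demo, y_demo)
```

With real, noisy data the recovered exponents would carry uncertainty, and a stepwise procedure could drop factors that this all-in-one fit retains.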

Results and discussion

Regression models for THMs and HANs

THMs and HANs levels in the chlorinated waters are summarized in Table 2. The highest T-THMs and T-HANs levels were observed in Tai Lake, followed by Qiantang River and then Jinlan Reservoir. Nevertheless, the mean T-THMs levels were all within the limits of the Chinese DBPs standard (Ministry of Health 2006).

Table 2 Mean ± SD and range (min.–max.) of THMs and HANs (µg L−1) from the chlorination tests

The multiple regression models of DBPs are presented in Eqs. (1)–(6). Both T-THMs and T-HANs were positively correlated with DOC, Br, chlorine dose (Cl2/DOC), temperature (Temp) and reaction time (t) (Eqs. 1 and 4). These results were generally expected. A high DOC concentration means a high level of DBPs precursors, which may significantly enhance the formation of THMs and HANs (Nikolaoua and Lekkasa 2001). In this study, the contribution of DOC to T-THMs and T-HANs yields derived mainly from the DOC-dependent increase in TCM and DCAN, respectively (Eqs. 2 and 5). Bromide, which acts as an inorganic precursor in DBPs formation, is quickly transformed to hypobromous acid (HOBr) during chlorination. Like hypochlorous acid (HOCl), HOBr can react with DOM, leading to the formation of brominated DBPs (Symons et al. 1993; Hong et al. 2013b). Owing to the stronger substitution ability of HOBr and the higher atomic weight of bromine compared with chlorine (Symons et al. 1993; Xie 2004), the brominated DBPs (DCBM, BCAN) as well as the total DBPs (T-THMs and T-HANs) increased (Eqs. 1, 3, 4 and 6) whereas the chlorinated DBPs (TCM, DCAN) decreased (Eqs. 2 and 5) with increasing bromide level. Moreover, the formation of DBPs is a multi-stage process, and some intermediate by-products may form during chlorination (Bond et al. 2012; Shah and Mitch 2012). A suitable increase in chlorine dose and reaction time may result in a more complete reaction and produce more of the downstream products such as THMs and HANs. Therefore, the positive influence of chlorine dose and reaction time on THMs and HANs formation (Eqs. 1–6) was reasonable. In addition, because HANs are intermediate DBPs, too much chlorine or too long a reaction time may accelerate their decomposition and reduce HANs yields (Xie 2004; Yang et al. 2007), but this did not occur in this study.
Elevated temperature accelerates the reaction, which may lead to rapid accumulation of stable DBPs such as THMs. On the other hand, elevated temperature may also accelerate the hydrolysis and decomposition of unstable DBPs such as HANs (Yang et al. 2007). Thus, the temperature-dependent increase in THMs (Eqs. 1–3) was expected. Temperature also had a positive effect on the yields of T-HANs and DCAN (Eqs. 4 and 5), implying that their generation outweighed their decomposition.

$$\text{T-THMs} = 10^{-2.534} (\text{Br})^{0.212} (t)^{0.305} (\text{DOC})^{0.369} (\text{Temp})^{0.662} (\text{Cl}_{2}/\text{DOC})^{0.400} (\text{pH})^{2.364} \quad (R^{2} = 0.966,\ p < 0.0005,\ n = 36)$$
(1)
$$\text{TCM} = 10^{-3.822} (\text{DOC})^{1.575} (\text{Br})^{-0.403} (t)^{0.329} (\text{Cl}_{2}/\text{DOC})^{0.536} (\text{Temp})^{0.692} (\text{pH})^{2.330} (\text{UV}_{254})^{-0.933} \quad (R^{2} = 0.953,\ p < 0.0005,\ n = 36)$$
(2)
$$\text{DCBM} = 10^{-1.188} (\text{Br})^{0.411} (\text{UV}_{254})^{1.042} (t)^{0.259} (\text{Temp})^{0.560} (\text{pH})^{1.732} (\text{Cl}_{2}/\text{DOC})^{0.238} \quad (R^{2} = 0.972,\ p < 0.0005,\ n = 36)$$
(3)
$$\text{T-HANs} = 10^{-1.065} (\text{Br})^{0.346} (\text{DOC})^{0.481} (\text{Cl}_{2}/\text{DOC})^{0.520} (t)^{0.238} (\text{Temp})^{0.373} \quad (R^{2} = 0.943,\ p < 0.0005,\ n = 36)$$
(4)
$$\text{DCAN} = 10^{-0.583} (\text{Br})^{-0.581} (t)^{0.297} (\text{Cl}_{2}/\text{DOC})^{0.577} (\text{DOC})^{1.452} (\text{Temp})^{0.472} \quad (R^{2} = 0.933,\ p < 0.0005,\ n = 36)$$
(5)
$$\text{BCAN} = 10^{0.615} (\text{Br})^{0.522} (\text{UV}_{254})^{1.117} (\text{Cl}_{2}/\text{DOC})^{0.511} (t)^{0.144} \quad (R^{2} = 0.963,\ p < 0.0005,\ n = 36)$$
(6)
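
As an illustration of how the fitted power functions can be evaluated, the sketch below codes Eqs. (1) and (4) as exponent tables. The units are assumptions inferred from the methods section (Br in µg L−1, t in h, DOC in mg L−1, Temp in °C, Cl2/DOC in mg mg−1, DBPs in µg L−1), and the dictionary and function names are illustrative. The example input combines the Tai Lake DOC and bromide levels quoted earlier with the baseline chlorination condition.

```python
# Exponents copied from Eqs. (1) and (4); "log_k" is the power of ten.
MODELS = {
    "T-THMs": {"log_k": -2.534, "Br": 0.212, "t": 0.305, "DOC": 0.369,
               "Temp": 0.662, "Cl2/DOC": 0.400, "pH": 2.364},
    "T-HANs": {"log_k": -1.065, "Br": 0.346, "DOC": 0.481,
               "Cl2/DOC": 0.520, "t": 0.238, "Temp": 0.373},
}

def predict(model_name, conditions):
    """Evaluate Y = 10**log_k * prod(X**a) for the named model."""
    model = MODELS[model_name]
    result = 10.0 ** model["log_k"]
    for factor, exponent in model.items():
        if factor == "log_k":
            continue
        result *= conditions[factor] ** exponent
    return result

# Assumed units: Br ug/L, t h, DOC mg/L, Temp deg C, Cl2/DOC mg/mg.
tai_lake_baseline = {"Br": 248, "t": 24, "DOC": 10.3, "Temp": 20,
                     "Cl2/DOC": 1.5, "pH": 7.0}
tthm = predict("T-THMs", tai_lake_baseline)
than = predict("T-HANs", tai_lake_baseline)
```

Keeping the exponents in a table rather than hard-coding each formula makes it straightforward to add the remaining equations or to compare coefficients across models.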

The regression models also revealed that pH had a positive effect on THMs formation (Eqs. 1–3), because alkaline conditions facilitate the hydrolysis of intermediate DBPs to form THMs (Xie 2004). As intermediate products, HANs may also undergo base-catalyzed decomposition (Yang et al. 2007), yet in this study pH had no significant effect on HANs formation (Eqs. 4–6). In addition, DOC or UV254 rather than DON showed positive effects on HANs formation (Eqs. 4–6), which seems inconsistent with previous findings that higher levels of organic nitrogen led to higher yields of HANs (Reckhow et al. 1990; Nikolaoua and Lekkasa 2001). This may be because not all organic nitrogen compounds are potent precursors of HANs. For example, under the same chlorination conditions, the HANs yields from aspartic acid, alanine, tyrosine and tryptophan were high, whereas those from glycine, isoleucine and other amino acids were negligible (Yang et al. 2012). In other words, the characteristics of the DON rather than its amount may determine the yields of nitrogenous DBPs (Chu et al. 2010).

According to the partial correlation coefficients (Table 3), DOC and Br exerted the most important influence on the formation of T-HANs and DCAN, whereas for T-THMs and TCM, DOC, Br and the reaction time showed the dominant effects. For the brominated DBPs (DCBM and BCAN), the bromide level was the most important factor. In addition, the partial correlation coefficients of reaction time, temperature and pH were markedly higher in the THMs models than in the HANs models. This difference may be associated with the stability of the two groups: THMs are stable end products, whereas HANs are relatively unstable intermediate products.

Table 3 Partial correlations between DBPs yields and the disinfection conditions

Models (1)–(6) described the experimental data well, with R² values ranging from 0.93 to 0.97, compared with R² values of 0.28 to 0.98 for other DBPs formation models (Chowdhury et al. 2009). An independent-samples t test also showed no significant difference between the measured data and the predicted data for any of the models (S-Table 6), indicating that the predicted values agreed with the observed ones. Figure 1 shows the internal evaluation of the DBPs models. An ideal simulation corresponds to an intercept of 0, a slope of 1 and an R² of 1.0 (Sohn et al. 2004). In the present study, the slopes of the predicted versus measured THMs ranged from 0.99 to 1.05, and the corresponding R² values ranged between 0.94 and 0.96 (Fig. 1a–c), demonstrating that the THMs models had good accuracy and precision. In comparison, the HANs models showed similar accuracy (slope: 0.96–1.02) but lower precision (R²: 0.77–0.92) (Fig. 1d–f). For the models of T-THMs, TCM and DCBM, 86.1–97.2 % of the predicted values fell within ±25 % of the measured values (defined as “normal values”), with only small fractions (0–2.8 %) of extreme values (outside ±40 % of the measured values). The models for T-HANs, DCAN and BCAN showed weaker evaluation ability (normal values: 75–83 %; extreme values: 5.6–13.9 %).

Fig. 1 Relationships between the predicted DBPs levels and the measured levels
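
The internal evaluation summarized above (slope of predicted versus measured values, share of predictions within ±25 % and share outside ±40 % of the measured values) could be computed along the lines of the sketch below. The function name and the sample arrays are made up for illustration, and the slope is taken from an ordinary least-squares line of predicted on measured values, which is an assumption about how Fig. 1 was constructed.

```python
import numpy as np

def evaluate_predictions(measured, predicted):
    """Return slope, intercept, R^2, and the shares of normal/extreme values."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Least-squares line: predicted = slope * measured + intercept.
    slope, intercept = np.polyfit(measured, predicted, 1)
    r_squared = np.corrcoef(measured, predicted)[0, 1] ** 2
    # Relative deviation of each prediction from its measured value.
    rel_dev = np.abs(predicted - measured) / measured
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": r_squared,
        "normal_pct": 100.0 * np.mean(rel_dev <= 0.25),
        "extreme_pct": 100.0 * np.mean(rel_dev > 0.40),
    }

# Hypothetical measured and predicted concentrations (ug/L).
measured_demo = [10.0, 20.0, 40.0, 80.0]
predicted_demo = [11.0, 18.0, 60.0, 84.0]
stats = evaluate_predictions(measured_demo, predicted_demo)
```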

Comparison of T-THMs models developed from DOM with different SUVA values

To evaluate the contribution of DOM with different SUVA to DBPs formation, the DOC-based T-THMs model developed in this study was compared with models from other studies that take a similar power-function form (Table 4):

$$\text{DBPs} = k (\text{DOC})^{a} (\text{Br}^{-})^{b} (\text{Temp})^{c} \cdots$$

Because several papers developed models without reporting SUVA values, the comparison in Table 4 includes only studies that provided SUVA values and had roughly similar water quality and chlorination conditions (Amy et al. 1998; Sohn et al. 2004; Hong et al. 2007, 2008). The coefficients represent the effect of each factor on DBPs formation (Sohn et al. 2004). The highest coefficient of DOC (CDOC) in the T-THMs models was observed for the raw water in the USA (CDOC = 1.098), followed by the raw water in the Pearl River Delta (CDOC = 0.852) and the coagulated water in the USA (CDOC = 0.801), and the lowest was for the raw water in the Yangtze River Delta in this study (CDOC = 0.369). The rank of the DOC coefficients generally agreed with the corresponding SUVA values (Table 4, Eqs. 1–4), i.e. the higher the SUVA, the higher the DOC coefficient. This may be because DOM with high SUVA is a more reactive precursor for THMs formation (Nikolaoua and Lekkasa 2001; Uyak and Toroz 2007). In other words, if the other parameters are the same, less THMs will form in low-SUVA waters.

Table 4 Comparison of regression models of THMs formation during chlorination

Moreover, the coefficients of bromide (CBr) in the T-THMs models also appeared to be SUVA dependent (Table 4). CBr for the water with high SUVA (CBr = 0.068 in Eq. 1 of Table 4) was much lower than for the waters with low SUVA (CBr = 0.218–0.223 in Eqs. 2, 4 and 5 of Table 4). The underlying reason may be that bromide is more reactive with hydrophilic DOM (low SUVA) than with hydrophobic DOM (high SUVA) (Liang and Singer 2003). This means that under the same conditions, a greater percentage of bromide can be substituted into THMs molecules in water with lower SUVA values and thus make a greater contribution to THMs formation. Based on this pattern, it can be inferred that bromide tends to be a key factor in DBPs formation for water with low SUVA and is likely to be a minor factor for water with high SUVA.
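
Because the models are power functions, an exponent translates directly into a fold change: multiplying a factor by r multiplies the predicted yield by r raised to its coefficient, with all else held constant. The sketch below applies this to the bromide coefficients quoted above; the function name is illustrative.

```python
def fold_change(ratio: float, exponent: float) -> float:
    """Factor by which Y = k * X**exponent changes when X is scaled by ratio."""
    return ratio ** exponent

# Doubling bromide with the coefficients quoted in the text.
high_suva = fold_change(2.0, 0.068)   # high-SUVA water
low_suva = fold_change(2.0, 0.218)    # lower bound for low-SUVA waters
print(f"high SUVA: x{high_suva:.3f}, low SUVA: x{low_suva:.3f}")
```

Under these coefficients, doubling bromide would raise the predicted T-THMs by roughly 5 % in the high-SUVA model and by about 16 % in the low-SUVA models.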

Comparison of the models in Table 4 further revealed that the coefficients of reaction time (Ct) were closely related to the length of the reaction time. Ct values in short-time models (t = 1–2 h, Ct = 0.34; t = 2–72 h, Ct = 0.305) were generally higher than those in long-time models (t = 2–96 h, Ct = 0.246; t = 6–168 h, Ct = 0.258; t = 2–168 h, Ct = 0.263–0.264). These results indicate that THMs formed more rapidly in the short term than in the long term.

The above comparisons indicate that regression models derived from different waters are site specific. Applying such models to evaluate THMs formation requires consideration of the water quality, especially the SUVA, which not only influences the total yields of THMs but also controls the contribution of bromide to THMs formation. Besides SUVA, the range of reaction time and other factors should also be taken into account when a model developed for one place is used to evaluate THMs formation in another.

Conclusions

The regression models evaluated well the formation of THMs and HANs during chlorination of source water from Tai Lake, Qiantang River and Jinlan Reservoir. For the TCM, DCBM and T-THMs models, 86–97 % of the calculated values fell within ±25 % of the measured values, whereas the DCAN, BCAN and T-HANs models showed relatively weak evaluation ability, as only 75–83 % of the calculated values were within ±25 % of the measured values. The DOM level (DOC/UV254) and the bromide concentration were the key factors controlling HANs formation. For THMs formation, DOM, bromide and the reaction time exerted the most important influence. Comparison of T-THMs models generated from various waters shows that the coefficients of DOC and bromide were generally dependent on SUVA values: T-THMs models for low-SUVA water may have low DOC coefficients but high bromide coefficients. In addition, the coefficients of time appeared to be closely related to the length of the chlorination time.