Introduction

Chlorination of water to prevent microbiological contamination results in the formation of a wide range of organic compounds known as disinfection by-products (DBPs) (Richardson 2011; Rook 1974). More than 600 DBPs have been detected, but only a small number have been identified (Richardson 2011). These compounds have potential adverse effects on human health. Trihalomethanes (THMs) and haloacetic acids (HAAs) are the most prevalent DBPs in drinking water. Their formation is relatively well understood and their levels are regulated in various countries (especially THMs) (Richardson 2011; Rodriguez et al. 2004; Singer 2002). However, most DBPs are not regulated. For example, haloacetonitriles (HANs), haloketones (HKs), and halonitromethanes (HNMs) are non-regulated DBPs present in water treated with chlorine or in combination with alternative disinfectants (Hua and Reckhow 2007; Krasner et al. 1989; Plewa et al. 2004; Richardson 2011). HANs and HNMs are nitrogenous DBPs with higher cytotoxicity, genotoxicity, developmental toxicity, and growth inhibition than carbon-based DBPs (Liu and Zhang 2014; Muellner et al. 2007; Richardson et al. 2007). HKs are among the non-regulated DBPs presenting the highest levels in drinking water reported in previous studies (Krasner et al. 2006; Wei et al. 2010).

Small water systems (SWS; i.e., serving 5000 or fewer people) supplied by surface waters are generally vulnerable to high DBP levels. These systems have a relatively low capacity to implement adequate treatment technologies and are often unable to hire qualified operators to manage operational conditions (Coulibaly and Rodriguez 2004; Edwards et al. 2012). Consequently, water treatment processes employed in small systems are usually less efficient at removing DBP precursors than those in larger ones (Dore et al. 2013). Our previous study showed that average measured concentrations of DBPs in SWS were much higher than those reported in the literature for medium and large systems (Guilherme and Rodriguez 2014). DBPs represent a real challenge for SWS. As the cost of laboratory analyses is high, SWS have difficulty adequately characterizing the spatio-temporal variability of DBPs.

Most tools to estimate DBP levels, for example regression models, have been devoted only to regulated DBPs (e.g., Abdullah and Hussona 2013; Di Cristo et al. 2013; Ged et al. 2015; Sadiq and Rodriguez 2011; Zimoch and Lobos 2014). In this paper we present the development of models designed for SWS to estimate the occurrence of non-regulated DBPs based on information on relevant parameters that are easily and regularly monitored, including regulated DBPs. As little information on non-regulated DBP occurrence in SWS is available, this research is based on data generated during an intensive sampling program that we carried out in a wide variety of SWS in Canada.

In order to develop the models, the parameters most correlated with non-regulated DBP concentrations were first identified through bivariate correlation analyses. Various parameters influence DBP formation and speciation, such as the nature and amount of organic matter (Karanfil et al. 2008), disinfectant type and dose (Adams et al. 2005; Bougeard et al. 2010; Bull et al. 2009; Crittenden et al. 2005), as well as season and location within the distribution system (DS) (Guilherme and Rodriguez 2014, 2015). All parameters influencing non-regulated DBP occurrence were then brought together in multilinear regression models to estimate the concentrations of these compounds in specific locations of the DS. To our knowledge, this is the first time that the occurrence of non-regulated DBPs in small municipal systems has been modeled from information obtained through a robust sampling program.

Methodology

Case studies

An intensive and structured sampling program was carried out in 25 SWS in two provinces of Canada, Newfoundland and Labrador (NL) and Quebec (QC). System sampling campaigns were conducted monthly over 1 year from September 2010 to October 2011. All systems were supplied by surface water sources and most of them used chlorine as the main disinfectant (for primary and secondary disinfection). Only one SWS used chloramines. Systems served populations varying from 330 to 6220 inhabitants. Systems in NL had no treatment prior to chlorination, whereas in QC, systems mostly implemented conventional treatment processes prior to disinfection. The four different water treatment processes observed within the 25 SWS under study are presented in the results section (“Influence of treatment conditions on DBP speciation”).

Sampling and analysis

Water was sampled in the water treatment plant (WTP) just before disinfection (and after filtration if a treatment prior to disinfection was present) and in various locations along the DS in order to collect treated water samples at different residence times (Table 1). Water was sampled at the beginning (DS1), middle (DS2), and end of the DS (DS3). Samples were collected by water operators (in NL) and by U. Laval personnel (in QC). Samplers were trained to follow equivalent sampling procedures in both provinces. Following field collection, the samples were sent to the U. Laval laboratory for analysis.

Table 1 Parameters measured during the sampling campaigns

In WTP, indicators for precursors of DBPs were estimated using various parameters (e.g., ultraviolet absorbance at 254 nm (UV-254) and dissolved organic carbon (DOC)). DOC primarily represents total dissolved organic carbon (humic substances and non-humic substances) and UV-254 is an indicator of aromatic compounds in water. SUVA represents the ratio UV-254/DOC × 100, an indicator of carbon aromaticity. Turbidity was also measured. Other parameters were measured (pH and temperature) due to their importance in DBP speciation or formation kinetics (Bull et al. 2009; Liang and Singer 2003; USEPA 2006; Zhang et al. 2013). Residual disinfectant levels (free chlorine) and DBP concentrations were measured in treated water after chlorination. The sampling strategy allowed us to obtain about 800 measurements for UV-254, DOC, and turbidity, about 500 measurements for pH and temperature, and about 1000 measurements for free residual chlorine. Some parameters were not sampled in every location because we sought only to gain a general overview of water characteristics in the DS. Table 1 summarizes parameters measured at each sampling point. Information on water quality of all SWS sampled is available in the Supplementary Table S1.
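The SUVA definition given above (UV-254/DOC × 100) can be illustrated with a short Python sketch. The units are an assumption made here for illustration (UV-254 in cm⁻¹ and DOC in mg/L, which yields L/(mg·m)); the function name and the sample values are hypothetical and are not taken from the study.

```python
def suva(uv254_per_cm: float, doc_mg_per_l: float) -> float:
    """Return SUVA = UV-254 / DOC x 100.

    Assumption: UV-254 is in cm^-1 and DOC in mg/L, so the result is in L/(mg.m).
    """
    if doc_mg_per_l <= 0:
        raise ValueError("DOC must be positive to compute SUVA")
    return uv254_per_cm / doc_mg_per_l * 100.0


# Hypothetical water sample before disinfection (illustrative values only).
example = suva(uv254_per_cm=0.150, doc_mg_per_l=5.0)
print(f"SUVA = {example:.1f} L/(mg.m)")
```

Guarding against a zero or negative DOC avoids reporting a meaningless ratio for samples with missing or invalid DOC results.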

Five families of DBPs were considered in the study: THMs, HAAs, and three families of non-regulated DBPs (HANs, HNMs, and HKs). Four THMs (chloroform (TCM), bromodichloromethane (BDCM), dibromochloromethane (DBCM), and tribromomethane (TBM)), five HAAs (monochloroacetic acid (MCAA), monobromoacetic acid (MBAA), dichloroacetic acid (DCAA), trichloroacetic acid (TCAA), and dibromoacetic acid (DBAA)), four HANs (dichloroacetonitrile (DCAN), trichloroacetonitrile (TCAN), bromochloroacetonitrile (BCAN), and dibromoacetonitrile (DBAN)), one HNM (chloropicrin (CPK)), and two HKs (1,1-dichloropropanone (DCP) and 1,1,1-trichloropropanone (TCP)) were analyzed during the study. The quantification limits for THM species were 3.7 μg/L for TCM, 2.0 μg/L for BDCM, 3.3 μg/L for DBCM, and 2.7 μg/L for TBM. The quantification limit for all HAA species was 1.0 μg/L. The quantification limit for HANs, HNMs, and HKs was 0.01 μg/L.

This sampling strategy allowed us to obtain 900 measurements of each DBP under study. Complementary information about the analytical methods used is provided in our previous study (Guilherme and Rodriguez 2015).

Modeling

The software product IBM© SPSS© Statistics Version 21.0.0.0 was used for statistical analyses. The modeling approach aimed at estimating non-regulated DBP concentrations from mixed linear regression analysis. This method maximizes the covariance between independent variables and the dependent variable (that is, HAN, CPK, and HK concentrations) to obtain optimal estimations. Physico-chemical parameters of water (pH, DOC, UV-254, temperature, turbidity) at the WTP and THM and HAA levels in the DS were considered as independent variables. Also, linear mixed regression models include additional random-effect terms, and are appropriate for representing dependent data, for example when data are gathered over time on the same individuals (distribution systems here) (Minalu et al. 2011). The Akaike Information Criterion (AIC) was used for model selection (Akaike 1973). The models were optimized by selecting only the most significant explanatory variables through backward elimination. Step by step, variables were rejected from the model based on the optimization of AIC values (as low as possible). The fit and performance of the regression models were assessed using the coefficient of determination (R²), the normal probability plot of residuals, plots of residuals versus predicted values and versus data order, and the Shapiro-Wilk test on residuals (Razali and Wah 2011). To simplify interpretation, the coefficient of determination, R², is a pseudo-R² defined as the squared correlation coefficient between the observed and the predicted response (Spearman’s correlation), based on the definition of R² in linear regression (Xu 2003).
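The two selection and evaluation steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the SPSS procedure used in the study: `pseudo_r2` takes the squared Spearman correlation between observed and predicted values, as the text defines it, and `backward_elimination` assumes a simple stopping rule (drop the variable whose removal lowers the AIC most, stop when no removal lowers it). The `toy_aic` function and the variable names are hypothetical stand-ins for refitting the mixed model.

```python
from typing import Callable, List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the average of their ranks."""
    first, count = {}, {}
    for position, value in enumerate(sorted(values), start=1):
        first.setdefault(value, position)
        count[value] = count.get(value, 0) + 1
    return [first[v] + (count[v] - 1) / 2.0 for v in values]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def pseudo_r2(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Squared Spearman correlation between observed and predicted values."""
    rho = pearson(average_ranks(observed), average_ranks(predicted))
    return rho * rho


def backward_elimination(
    variables: Sequence[str], aic_of: Callable[[List[str]], float]
) -> Tuple[List[str], float]:
    """Drop one variable at a time for as long as doing so lowers the AIC."""
    current = list(variables)
    best = aic_of(current)
    while len(current) > 1:
        trials = [(aic_of([v for v in current if v != d]), d) for d in current]
        trial_aic, dropped = min(trials)
        if trial_aic >= best:  # assumed stopping rule: no removal improves AIC
            break
        best = trial_aic
        current.remove(dropped)
    return current, best


# Hypothetical AIC: rewards two "useful" variables, charges 2 per parameter.
USEFUL = {"HAA_DS2", "UV254"}


def toy_aic(variables: List[str]) -> float:
    return 100.0 - 10.0 * len(USEFUL & set(variables)) + 2.0 * len(variables)


kept, best_aic = backward_elimination(["HAA_DS2", "THM_DS3", "UV254", "pH"], toy_aic)
print(kept, best_aic)
```

In practice `aic_of` would refit the mixed model on the retained variables and return its AIC; passing it in as a callable keeps the selection loop independent of the fitting software.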

We chose to propose models for various types of treatments and seasons to improve the conditions for model applicability. To simplify, all forms of treatment were classified in two categories: with or without treatment that can remove natural organic matter (NOM) prior to disinfection. This classification was used because of the large gap in treated-water DBP levels between systems without treatment and systems with at least clarification and filtration processes. Only systems using chlorination as secondary disinfectant were selected. Thus, all NL systems were included in the “without treatment prior to disinfection” category, as well as one system from QC (12 systems). In the “with treatment prior to disinfection” category, only systems from QC were included (13 systems). Seasons were also taken into account by separating the warm semester (summer and fall) from the cold semester (winter and spring) because of the large gap between DBP levels in the warm and cold semesters. DBP levels in the DS were taken at specific locations that typically correspond to their maximum measured levels. Thus, regulated DBP levels correspond to levels observed at the end of the DS (DS3) for THMs and at the middle of the DS (DS2) for HAAs. These locations for regulated DBPs were chosen on the basis of regulations for THMs (Health Canada 2012; MDDELCC 2012) and previous studies that concluded that the HAA maximum level is measured near the middle of the DS (Guilherme and Rodriguez 2014, 2015; Rodriguez et al. 2004). Non-regulated DBP levels were also estimated at their maximum level location in the DS, corresponding to the level measured at the middle of the DS (DS2). In fact, our previous studies showed that the variability of non-regulated DBP levels along the DS is more associated with HAA variability than with THM variability (Guilherme and Rodriguez 2014, 2015). At the WTP, DOC, UV-254, turbidity, SUVA, pH, and temperature were taken into account. Also, free chlorine (Free_Cl) levels in DS3 were included in the models as they are ordinarily measured at this point by water operators. Considering treatment conditions, seasons, and specific locations for non-regulated DBPs improves the statistical power and conditions of application of the models.
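The stratification described above (two treatment categories crossed with two semesters) amounts to a simple lookup that selects which model applies to a given sample. The sketch below is illustrative only. The month boundaries are an assumption: the paper states summer as July–September and winter as January–March, so fall is taken here as October–December and spring as April–June. All function names are hypothetical.

```python
from typing import Tuple

# Assumption: summer = Jul-Sep (stated in the paper), fall = Oct-Dec (inferred).
WARM_MONTHS = {7, 8, 9, 10, 11, 12}


def semester(month: int) -> str:
    """Return 'warm' (summer and fall) or 'cold' (winter and spring)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return "warm" if month in WARM_MONTHS else "cold"


def treatment_category(removes_nom_before_disinfection: bool) -> str:
    """Return the paper's two-way treatment category label."""
    if removes_nom_before_disinfection:
        return "with treatment prior to disinfection"
    return "without treatment prior to disinfection"


def model_key(month: int, removes_nom_before_disinfection: bool) -> Tuple[str, str]:
    """Identify which of the four treatment-by-semester models applies."""
    return treatment_category(removes_nom_before_disinfection), semester(month)


print(model_key(8, True))
print(model_key(2, False))
```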

Validation of developed models

The models developed were validated using a different database generated in a previous investigation conducted with the Quebec Environment Ministry (Ministère du Développement durable, de l’Environnement et de la Lutte contre les changements climatiques). In 2010, a sampling campaign was carried out in ten municipal water systems in Quebec. Five of these systems were small systems also investigated in this research. The others were medium water systems (MWS). Two sampling campaigns were conducted: one in winter and one in summer. Each system was sampled in the WTP and in six locations along the DS. Four THMs (TCM, BDCM, DBCM, and TBM), five HAAs (MCAA, MBAA, DCAA, TCAA, and DBAA), four HANs (DCAN, TCAN, BCAN, and DBAN), and two HKs (DCP and TCP) were measured along the DS. Also, various water quality parameters (DOC, UV-254, pH, and temperature) were measured at every location, from the WTP to the end of the DS. Sampling and analytical protocols were similar to the protocols presented above. Only one SWS used disinfection as its only treatment; all the other SWS and MWS had treatments to remove NOM prior to disinfection.

Results and discussion

Because of the diversity of factors influencing DBP occurrence, we conducted bivariate correlation analyses between the observed levels of non-regulated DBPs and the following factors: treatment conditions, WTP water quality parameters, and the levels of regulated DBPs (THMs and HAAs). Building on these bivariate analyses, multivariate models were then developed to represent the simultaneous influence of the most correlated factors. Except for the analysis of the influence of treatment conditions (presented in “Influence of treatment conditions on DBP speciation”), the data of all systems were considered for the statistical analyses.

Influence of treatment conditions on DBP speciation

To illustrate the influence of treatment conditions (type of treatment and disinfectant used), four SWS with different treatments were selected among all SWS studied in QC and NL so that the same number of data could be compared in each treatment category. In the selected systems during summer (July–September) and winter (January–March), non-regulated DBP levels were influenced by the type of treatment and disinfectant used (Table 2). Indeed, in SWSa and SWSb, HAN, CPK, and HK levels were almost 60% lower in systems using chloramines than in systems using chlorine. Previous studies have shown that monochloramine is less reactive than free chlorine, and forms DBPs at much lower concentrations (Carlson and Hardy 1998; Crittenden et al. 2005). However, this effect on non-regulated DBPs has not previously been reported for small systems. Also, the type of treatment influences non-regulated DBP levels in SWS. The presence of treatments prior to disinfection (such as clarification and filtration) that reduce NOM levels before chlorination in SWSb, SWSc, and SWSd reduces DBP formation (Table 2). Also, the type of primary disinfectant (such as ozone or chloramines) can influence DBP levels and speciation (Yang et al. 2007; Richardson 2011; Mao et al. 2015). However, Table 2 reveals that DBP levels in systems without treatment prior to chlorination were much higher than in systems with treatment prior to disinfection (regardless of the disinfectant). Thus, in the remainder of the study, all forms of treatment were classified in two categories, with or without treatment prior to disinfection, due to the large gap between DBP levels. Also, non-regulated DBP levels were higher in small systems like SWSb, SWSc, and SWSd than levels previously reported in the literature for larger systems (Golfinopoulos et al. 2003; Krasner et al. 2006).

Table 2 Average levels of non-regulated DBP in location DS2 during summer (July, August, and September) and in winter (January, February, and March) in four SWS using different types of treatment (number of observations per DBP per season = 3)

Influence of water quality characteristics on non-regulated DBP levels in the DS

Regulated and non-regulated DBP levels also fluctuate between seasons (Supplementary Table S1). To compare seasonal DBP levels, a Kruskal-Wallis test was performed to detect statistically significant differences in DBP levels between seasons. Results showed that THM, HAN, and HK levels differ significantly depending on the season (at the 1% level of significance according to Kruskal-Wallis with SPSS©). HAA and CPK levels were not significantly different between seasons, probably due to some extreme levels observed in a few SWS in winter for HAAs and the frequent low levels of CPK observed in most SWS (data not shown). To identify the seasons with the most similar DBP levels, a Mann-Whitney test was performed for all DBP levels during the four seasons (Supplementary Table S2). Results of the Mann-Whitney test revealed that winter and spring, and fall and summer, are the pairs of seasons with the most similar DBP levels (as revealed by differences in DBP levels in Supplementary Table S1).
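For readers unfamiliar with the seasonal comparison, the Kruskal-Wallis H statistic can be computed with a few lines of standard-library Python. This is a sketch of the textbook statistic (with tie correction), not a reproduction of the SPSS output. The seasonal concentrations below are hypothetical, and comparing H with the chi-square critical value for 3 degrees of freedom at the 1% level (11.345) is the usual large-sample approximation.

```python
from typing import List, Sequence

# Chi-square 0.99 quantile for 3 degrees of freedom (four seasons minus one).
CHI2_CRITICAL_1PCT_DF3 = 11.345


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the average of their ranks."""
    first, count = {}, {}
    for position, value in enumerate(sorted(values), start=1):
        first.setdefault(value, position)
        count[value] = count.get(value, 0) + 1
    return [first[v] + (count[v] - 1) / 2.0 for v in values]


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """Kruskal-Wallis H statistic with the standard correction for ties."""
    pooled = [value for group in groups for value in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)
    tie_counts = {}
    for value in pooled:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    tie_term = sum(c ** 3 - c for c in tie_counts.values())
    return h / (1.0 - tie_term / (n_total ** 3 - n_total))


# Hypothetical HAN levels (ug/L) by season, for illustration only.
seasons = {
    "winter": [0.8, 1.1, 0.9, 1.3, 1.0, 0.7],
    "spring": [1.2, 1.4, 1.0, 1.5, 1.1, 0.9],
    "summer": [3.2, 2.8, 3.9, 4.1, 3.5, 2.9],
    "fall": [2.7, 3.1, 3.3, 2.5, 3.8, 3.0],
}
h_stat = kruskal_wallis_h(list(seasons.values()))
print(f"H = {h_stat:.2f}; differs at 1% level: {h_stat > CHI2_CRITICAL_1PCT_DF3}")
```

A significant H only indicates that at least one season differs; pairwise tests such as Mann-Whitney are then needed to find which seasons resemble each other.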

Table 3 presents Spearman correlation coefficients between characteristics of water before disinfection and non-regulated DBP concentrations in the treated waters of all SWS in NL and QC. Results show that organic matter indicators such as UV-254 and DOC are correlated with non-regulated DBP levels. These NOM indicators are commonly used as surrogates for DBP precursors, such as hydrophobic NOM (Singer 1994). Also, results show that SUVA, turbidity, pH, and temperature are not strongly correlated with DBP levels. Indeed, turbidity measurements are related to overall suspended particles in water (Crittenden et al. 2005). In addition, several studies have suggested that SUVA is a poor indicator for non-regulated DBPs such as HANs, HKs, and HNMs (Hu et al. 2010; Hua et al. 2015). Also, the results of Table 3 show that pH is weakly and negatively correlated with DBP levels. Indeed, pH affects the reactivity of free chlorine. Depending on the pH, free chlorine can be present in different forms: hypochlorous acid (HOCl) and hypochlorite ion (OCl⁻) (pKa at 25 °C = 7.5) (White 1992). Since HOCl is more reactive than OCl⁻, organic matter is most efficiently halogenated at pH below the pKa of HOCl (Bull et al. 2009). This may explain the negative correlation between pH and DBPs. Also, previous studies have shown that pH affects the stability of non-THM DBPs (Fang et al. 2010; Yang et al. 2007). Indeed, increasing pH may enhance degradation rates of DCAN and DCP, thereby reducing their levels (Nikolaou et al. 2000). As pH affects DBP formation, stability, and degradation, their relationship may be more complex than a simple bivariate correlation, which may explain the weak Spearman correlation coefficients. Also, reaction rates usually increase with increasing temperature. If the compounds are relatively stable (e.g., chloroform), their formation increases with rising temperatures (Nikolaou et al. 2000). At the same time, however, rising temperatures can enhance the decomposition rates of several unstable compounds like DCP, TCP, DBAN, and DCAN (Bond et al. 2011; Nikolaou et al. 2001; Zhang and Minear 2002). Thus, temperature and pH influence DBP formation kinetics and speciation (Singer 1994; Chowdhury et al. 2009); however, they are not directly related to DBP precursors, and so their influence on DBP levels is not highlighted by our bivariate correlations. In summary, organic matter indicators like DOC, UV-254, and SUVA were the most significantly correlated parameters with the occurrence of non-regulated DBPs in the SWS under study.
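The pH dependence of free chlorine speciation mentioned above follows from the acid dissociation equilibrium of HOCl. The sketch below applies the standard Henderson-Hasselbalch relation with the pKa of 7.5 cited in the text; the function name is hypothetical, and temperature and ionic-strength effects on the pKa are ignored.

```python
def hocl_fraction(ph: float, pka: float = 7.5) -> float:
    """Fraction of free chlorine present as HOCl (the rest is hypochlorite).

    Uses [OCl-]/[HOCl] = 10 ** (pH - pKa), so
    HOCl / (HOCl + OCl-) = 1 / (1 + 10 ** (pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


for ph in (6.5, 7.5, 8.5):
    print(f"pH {ph}: {hocl_fraction(ph):.0%} of free chlorine as HOCl")
```

One pH unit below the pKa, roughly nine tenths of the free chlorine is HOCl; one unit above, roughly one tenth. This is consistent with the weak negative correlation between pH and DBP levels discussed in the text.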

Table 3 Spearman correlation matrix between water quality characteristics of treated water at the WTP (after filtration and before disinfection) and non-regulated DBP levels in the DS in all SWS (number of observations for each parameter = 300)

Correlations between regulated and non-regulated DBPs

Results presented in the previous section show that levels of non-regulated DBPs in the SWS under study depend on season, treatment type, and NOM indicators. It is also useful to evaluate whether levels of regulated DBPs can be used as surrogates for evaluating the levels of non-regulated DBPs. Table 4 presents a Spearman correlation matrix between DBP levels in location DS2. Correlations were calculated at one DS location in order to control for spatial variations along the DS. Results show that HAN and HK levels are strongly correlated with levels of regulated DBPs and that CPK is moderately correlated with both regulated DBPs. These observations agree with previous studies reporting correlations in spatial and temporal variations between HANs and regulated DBPs (Croue and Reckhow 1989; Wei et al. 2010; Gan et al. 2013; Bond et al. 2015). Even if HANs are nitrogenous DBPs, they remain influenced by NOM indicators (Chen and Westerhoff 2010). These observations are encouraging, in that regulated DBPs can, to some extent, be used as surrogates for non-regulated DBPs in small systems.

Table 4 Spearman correlation matrix between DBP average levels in location DS2 (number of observations for each parameter = 300)

Table 5a and b present correlations between variations of DBP concentrations along the DS in summer and winter. Spatial variations are represented by ratios, i.e., the ratio between DBP levels in DS2 and DS1 and the ratio between DBP levels in DS3 and DS2. Thus, the correlations are also based on the ratio values. Results show that correlations between regulated and non-regulated DBP variations were slightly different depending on location in the system (between DS1 and DS2 or between DS2 and DS3). However, the results of Table 5 reveal that HAA variations within the DS are more correlated with non-regulated DBP variations than THM variations are, especially in summer. Thus, seasons seem to influence not only overall non-regulated DBP levels (as mentioned in “Influence of treatment conditions on DBP speciation”), but also their spatial variability in small systems. Finally, it is important to note that the levels of correlation between each pair of non-regulated DBP families were comparable.
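The spatial variation ratios used in Table 5 can be computed as in the short sketch below. The function name and the sample concentrations are hypothetical. Rejecting non-positive denominators is an assumption added here, since the text does not say how values below the quantification limit were handled when forming ratios.

```python
from typing import Tuple


def variation_ratios(ds1: float, ds2: float, ds3: float) -> Tuple[float, float]:
    """Return (DS2/DS1, DS3/DS2) for one DBP measured along the DS."""
    if ds1 <= 0 or ds2 <= 0:
        raise ValueError("DS1 and DS2 levels must be positive to form ratios")
    return ds2 / ds1, ds3 / ds2


# Hypothetical HAN levels (ug/L) at the beginning, middle, and end of a DS.
first_half, second_half = variation_ratios(ds1=2.0, ds2=3.0, ds3=1.5)
print(f"DS2/DS1 = {first_half:.2f}, DS3/DS2 = {second_half:.2f}")
```

A ratio above 1 indicates net formation between the two locations, and a ratio below 1 indicates net loss (for example, through degradation).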

Table 5 Spearman correlation matrix between DBP variation ratios (a) DS1 and DS2 winter/summer; (b) DS2 and DS3 winter/summer (number of observations for each parameter in each season = 75)

To better understand the relationships between non-regulated and regulated DBPs, correlations between these two types of compounds were investigated according to parameters known to influence these relationships (levels of NOM indicators, locations within the DS, and the seasons). To do so, four categories of DOC values were selected (Supplementary Table S3). Figure 1 presents the conditions under which regulated DBP levels (THMs or HAAs) are the most correlated with HAN, CPK, and HK levels. Accordingly, when DOC is lower, at the end of the DS, and during the warm semester (summer and fall), HANs are more correlated with HAAs. This supports the observation that variations of HANs are more correlated with HAA variations than with THM variations (Table 5). Conversely, HANs seem mostly correlated with THMs when the DOC level is high and at the beginning of the DS, or during the cold period. Figure 1 also shows that CPK is correlated mostly with HAAs, especially at the end of the DS and during the warm semester. Finally, also according to Fig. 1, HKs are mostly correlated with HAAs except when the DOC level is very high or when the DOC level is low and in the cold semester. This supports the observation that CPK and HK variations are more correlated with HAA variations than with THM variations (Table 5).

Fig. 1 Identification of the most correlated regulated DBPs (Spearman correlation factors indicated) with non-regulated DBPs according to DOC levels before disinfection, season, and location in the DS

Modeling levels of non-regulated DBPs in small water systems

The next step in this study was to consider together the various parameters that have moderate or high correlations with non-regulated DBP occurrence and variation (identified above) in order to develop multivariate models for estimating the levels of these substances in the SWS. Models were developed for various types of treatments and seasons. Seasons were separated into two semesters based on the results obtained in “Influence of water quality characteristics on non-regulated DBP levels in the DS”: warm semester (summer and fall) and cold semester (winter and spring). Table 6 summarizes the results of the linear mixed regression models that associate non-regulated DBPs with regulated DBPs and WTP water characteristics. The models generated present high R² values, from 0.77 to 0.91. HAN models present R² values from 0.77 to 0.91, CPK models from 0.79 to 0.88, and HK models from 0.79 to 0.91, depending on the season and the treatment employed in the system. These values are comparable to R² values reported in a review on THM and HAA models (Sadiq and Rodriguez 2011). Generally speaking, the models highlighted the main correlations brought to light previously in “Correlations between regulated and non-regulated DBPs.” Indeed, HAA concentration is the most significant variable for most models estimating HAN, CPK, and HK levels in SWS (fifth column in Table 6). Also, THMs are a significant variable for non-regulated DBP models in systems without treatment prior to chlorination (which probably present higher DOC levels at the WTP), as revealed in Fig. 1. Finally, the relatively high correlations between regulated DBPs and non-regulated DBPs in the bivariate analysis in Table 4 and between UV-254 and DOC with non-regulated DBPs in Table 3 explain the presence of these parameters in all the models. Also, even if bivariate correlations (Table 3) between SUVA, pH, temperature, turbidity, and non-regulated DBPs are weak (their relationship could not be defined by bivariate correlations), these water quality parameters still appear in some multivariate regression models in Table 6.
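For readers who prefer symbols, the general structure of a linear mixed regression model and of the AIC used to select variables can be written as follows. This is a generic sketch under assumptions: the text does not state the random-effects structure, so a single random intercept per distribution system is assumed here, and all symbols are introduced for illustration only.

```latex
% y_{ij}: non-regulated DBP level (HAN, CPK, or HK) in system i at campaign j
% x_{k,ij}: explanatory variables (e.g., HAAs at DS2, THMs at DS3, UV-254, DOC)
% b_i: assumed random intercept for distribution system i
\[
  y_{ij} = \beta_0 + \sum_{k=1}^{p} \beta_k \, x_{k,ij} + b_i + \varepsilon_{ij},
  \qquad b_i \sim \mathcal{N}(0, \sigma_b^2), \quad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\]
% AIC with q estimated parameters and maximized likelihood \hat{L}
\[
  \mathrm{AIC} = 2q - 2 \ln \hat{L}
\]
```

Under this form, the random intercept absorbs system-specific differences that the monitored parameters do not explain, which is why repeated monthly samples from the same system can be modeled together.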

Table 6 Multivariate regression models for non-regulated DBP levels

Additional models were developed excluding DOC (and consequently SUVA) from the explanatory parameters. In fact, DOC is not a regulated parameter and its analytical cost is relatively high compared to other operational parameters such as UV-254, pH, temperature, and residual chlorine. The results show that the performance of models that exclude DOC as an explanatory variable (sixth column of Table 6) is slightly lower than that of the original models that include DOC, but remains quite comparable. In contrast, models that exclude THMs and HAAs as explanatory variables (seventh column of Table 6) perform less well than the original models and present lower R² values. Excluding regulated DBPs has a greater impact on model performance than excluding DOC. Fortunately, information on regulated DBP levels is generally available for small systems through regulatory monitoring.

Validation of HAN and HK models

The developed models were finally validated using a different database generated by the Quebec Environment Ministry (Ministère du Développement durable, de l’Environnement et de la Lutte contre les changements climatiques, MDDELCC) in 2010 (see “Validation of developed models”). Data on physico-chemical parameters in water before chlorination, THMs at the end of the DS, and HAAs at the middle of the DS were incorporated into the models presented in Table 6 depending on the season and treatment employed (models included DOC and regulated DBPs). Data for one SWS were used for models corresponding to systems without treatment prior to disinfection. Data for the rest of the systems were used in models for systems with treatment prior to disinfection. The model results obtained with the validation database were compared to average observed levels of non-regulated DBPs in the middle and end of the DS. Figures 2 and 3 present the correlations between observed values of HANs and HKs and the model-estimated values. Overall, the predictions are very good. For all systems, although the estimates are generally lower than observed values, the correlation is strong (R² = 0.68 for HANs and R² = 0.92 for HKs). Moreover, if we consider only the five SWS of the validation database, the correlations are even stronger (R² = 0.85 for HANs and R² = 0.95 for HKs). These results confirm that the models developed are effective at estimating non-regulated DBPs in SWS and can be generalized to other systems.

Fig. 2 Validation of HAN models: correlation between observed and estimated values in (a) all systems, (b) only small systems

Fig. 3 Validation of HK models: correlation between observed and estimated values in (a) all systems, (b) only small systems

The underestimation of model values in comparison with observed values may be associated with the fact that the limit of quantification (LQ) for HANs was lower in the study used to develop the models (LQ = 0.1 μg/L per HAN) than in the study used to validate them (LQ = 0.4 μg/L per HAN). Thus, levels in the validation database may be overestimated, especially because BCAN and DBAN levels are often close to the LQ. Generally speaking, differences between estimated and observed non-regulated DBP levels are lower for HANs (between ± 0.03 and ± 84%) than for HKs (between ± 2.0 and ± 110%).
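The differences between estimated and observed levels quoted above are percentages. The paper does not give the formula, so the sketch below assumes the common definition of relative difference with respect to the observed value; the function name and the sample values are hypothetical.

```python
def relative_difference_pct(estimated: float, observed: float) -> float:
    """Signed difference between estimate and observation, in % of the observation.

    Assumed definition: (estimated - observed) / observed x 100.
    Negative values mean the model underestimates the observed level.
    """
    if observed == 0:
        raise ValueError("observed value must be non-zero")
    return (estimated - observed) / observed * 100.0


# Hypothetical HAN levels (ug/L): the model estimate is below the observation.
diff = relative_difference_pct(estimated=1.6, observed=2.0)
print(f"Relative difference: {diff:+.0f}%")
```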

Conclusions

This study revealed that the variability of non-regulated DBP levels in small systems is statistically influenced by the variability of several parameters and conditions: type of treatment, water quality characteristics at the source and the treatment plant (especially DOC and UV-254 levels), location in the DS, and seasons. DOC, UV-254, and SUVA levels were particularly correlated with DBP levels. In contrast, pH and temperature were not strongly correlated with DBP levels; pH was even weakly and negatively correlated with DBP levels (Table 3). Results also revealed that levels of non-regulated DBPs are relatively highly correlated with HAA and THM levels. Depending on the type of treatment, the season, and location in the DS, regulated DBPs could be used as surrogates for non-regulated DBPs. Results showed that non-regulated DBPs are better correlated with HAAs except when DOC levels are very high (i.e., at the beginning of the DS) or when DOC levels are low (i.e., at the end of the DS) and in the cold semester. In these conditions, non-regulated DBPs are better correlated with THMs. However, in these situations, several studies revealed that HAA and THM level variations are similar (Rodriguez et al. 2007). That explains why non-regulated DBPs are overall better correlated with HAAs. When HAA and THM behaviors are similar, non-regulated DBPs are better correlated with THMs (and so, HAAs), and when their behaviors differ, non-regulated DBPs are overall better correlated with HAAs. This may be explained by the degradation of some species of HAAs (especially DCAA) and HANs as well as CPK in summer, in particular when water approaches the extremity of the DS, as already observed in the past for larger systems or under laboratory conditions (Bayless and Andrews 2008; Chang et al. 2010; Lebel et al. 1997; Mercier-Shanks et al. 2013). In contrast, in the cold semester (winter and spring) or in the presence of a high DOC level, it is unlikely that HANs or CPK are degraded; they are probably formed continuously along the DS, like THMs. Thus, even if THMs are the most regulated and monitored DBPs in drinking water worldwide, Fig. 1 suggests that they cannot be used alone as surrogates for non-regulated DBPs in the small systems under study (in most conditions of precursor levels, season, and location). Instead, HAA variability appears to be a better indicator of levels of non-regulated DBPs.

Finally, linear mixed regression models were developed to estimate levels of non-regulated DBPs based on levels of HAAs, THMs, free chlorine, and several physico-chemical water quality parameters. Model estimations were quite good and their validation with data from other campaigns demonstrated the capacity for generalization of the models developed.

The models can be used for various applications. They can be used for routine operational purposes, for example to estimate the benefits of handling pH or residual chlorine levels regarding non-regulated DBP occurrence. They can also be used for infrastructure planning purposes, in particular for evaluating the potential impacts of removing organic matter by treatment on non-regulated DBP levels. Finally, the models can be used to estimate population exposure to non-regulated DBPs occurring in drinking water of small systems, an important issue for regional risk analysis regarding potential health impacts of DBPs in drinking water.

This study has some limitations. Specific characteristics of operational conditions (e.g., precise chlorine doses) and distribution system hydraulics (e.g., residence time of water) could not be documented. Future studies should also consider other compounds, such as other HNMs and HKs, as well as iodinated and nitrogenous DBPs that have high toxicological relevance and for which there is still very little information for SWS.