Introduction

Noroviruses are small, round, structured viruses (previously named Norwalk-like viruses) in the family Caliciviridae, and are estimated to cause some 23 million cases of acute gastroenteritis (AGE) annually in the United States alone (Mead et al., 1999). These viruses are also a common cause of AGE outbreaks in healthcare and long-term care facilities, schools and daycares, and on cruise ships, accounting for 68–90% of such outbreaks (Green, 1997; Mead et al., 1999; Fankhauser et al., 2002). Waterborne outbreaks of Norovirus have been linked to contaminated drinking water in North America, Europe, and Asia (Huffman et al., 2003; Parshionikar et al., 2003; Kim et al., 2005; Godoy et al., 2006; Maunula, 2007); water sources may become contaminated by Norovirus due to discharge of sewage or wastewater (Hutson et al., 2004; Maunula, 2007; Murata et al., 2007).

A striking feature of the epidemiology of Norovirus infection is the tendency of these viruses to cause sporadic infections and outbreaks in winter months; this seasonality is so pronounced that Norovirus gastroenteritis is sometimes referred to as “winter vomiting disease” (McSwiggan et al., 1978). However, as with many infectious diseases, this well-described seasonal pattern of occurrence is poorly understood (Dowell and Ho, 2004). Improved understanding of the mechanisms that drive seasonal patterns of occurrence may lead to improved surveillance methodologies, disease prevention and control strategies, and understanding of disease pathogenesis (Fisman, 2007). Furthermore, understanding the role of the physical environment in the genesis of disease outbreaks is of increasing importance in an era of rapid environmental change (Greer et al., 2008).

Norovirus outbreaks are common in the Canadian province of Ontario, and large numbers of such outbreaks are investigated each year in Toronto, which is the province’s major population center. The Ontario Public Health Laboratory system provides diagnostic support for all regional public health units investigating outbreaks of gastroenteritis in institutions or in the community. The introduction of molecular testing methods in 2006 has markedly increased the number of identified Norovirus outbreaks available for study. We sought to use a database containing a large and geographically concentrated group of Norovirus outbreaks to ask the following question: “What environmental factors are associated with an increased frequency of Norovirus outbreaks in the Greater Toronto Area (GTA) in the winter months?” Our hypothesis was that environmental factors that increase virus survival and persistence in the environment would be temporally related to human outbreaks of Norovirus in the GTA.

Methods

Setting

The GTA is located in southern Ontario, Canada, on the north shore of Lake Ontario. It comprises five regions (Toronto, York, Halton, Peel, and Durham) (Fig. 1) and had a population of 5,555,912 in 2006 (Statistics Canada, 2006). Disease surveillance and the investigation of infectious disease outbreaks in each region are conducted by independent public health units. During outbreaks of gastroenteritis in the GTA, stool samples collected by the local public health units are sent to the Ontario Central Public Health Laboratory (CPHL) in Toronto for diagnostic testing. The dataset included outbreaks of gastroenteritis occurring in the GTA and investigated by the laboratory between November 10, 2005 and March 6, 2008 (N = 361). For each outbreak, limited information was available: the setting of the outbreak (e.g., hospital, long-term care facility, daycare, etc.), the city, the health unit region, the date, the number of samples submitted for testing, and the results of the diagnostic testing.

Fig. 1

Map of the Greater Toronto Area health regions, number of outbreaks (N) and the corresponding incidence of Norovirus outbreaks identified between November 10, 2005 and March 6, 2008.

We defined “cases” as gastroenteritis outbreaks identified as positive for Norovirus by reverse transcriptase-polymerase chain reaction (RT-PCR) for genogroups G1 and G2, and/or by electron microscopy for Norovirus (n = 253). We treated each outbreak as a single event, rather than as a series of independent cases, because our aim was to elucidate the effect of environmental exposure on initial case occurrence, rather than on subsequent propagation of cases via person-to-person spread. All of the outbreaks used for this analysis were investigated by public health authorities in the GTA, and were considered to be independent outbreaks (as linked outbreaks would have been reclassified under a single outbreak code).

Environmental Exposures

Environmental exposure data for the study period were obtained from publicly available databases: weather data, based on readings at Pearson International Airport in the Peel Region, were derived from the Canadian National Weather Archive (Environment Canada, 2008); UV radiation estimates were derived from the Toronto spectrophotometer of the World Ozone and Ultraviolet Radiation Data Centre, operated by Environment Canada (WOUDC, 2008); data on lake surface temperatures were obtained from the U.S. National Oceanic and Atmospheric Administration (NOAA, 2008); and river and creek flows for the four principal rivers in the GTA watershed (Don River, Black Creek, Humber River, and the Rouge River) were obtained from the Water Survey Branch of Environment Canada (Water Survey of Canada—Environment Canada, 2008).

Statistical Analysis

We evaluated seasonal and temporal trends in outbreak occurrence, using both Poisson and zero-inflated Poisson regression models that predicted weekly outbreak case counts based on weekly mean exposure estimates. Poisson models included oscillatory smoothers generated using sine and cosine terms, as well as a yearly term. Sine and cosine terms were incorporated to adjust for nonspecific seasonality of disease occurrence. This approach controls for expected seasonal oscillation in incidence, such that effects detected for environmental exposure variables added to the model are those in excess of what would be expected based on stereotyped seasonal oscillation (Diggle, 1990). Because of the extremely low incidence of Norovirus outbreaks during warmer months, we also evaluated effects using zero-inflated Poisson models (Afifi et al., 2007). These models capture summertime hiatuses in Norovirus activity as prolonged “zero-count” periods.
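As an illustration of the model structure described above, the following Python sketch builds the sine and cosine terms for a weekly series and evaluates a zero-inflated Poisson probability. It is a minimal sketch rather than the analysis code used in this study: the 52-week period, the function names, the placement of week 0 at the seasonal peak, and all coefficient values are assumptions made for illustration only.

```python
import math

WEEKS_PER_YEAR = 52  # assumed period of the seasonal oscillation


def harmonic_terms(week, period=WEEKS_PER_YEAR):
    """Return the (sine, cosine) smoother terms for a given week index."""
    angle = 2.0 * math.pi * week / period
    return math.sin(angle), math.cos(angle)


def expected_count(week, year, exposure, coef):
    """Expected weekly outbreak count under a log-linear Poisson model."""
    sin_t, cos_t = harmonic_terms(week)
    log_mu = (coef["intercept"]
              + coef["sin"] * sin_t
              + coef["cos"] * cos_t
              + coef["year"] * year
              + coef["exposure"] * exposure)
    return math.exp(log_mu)


def zip_pmf(k, mu, p_zero):
    """Zero-inflated Poisson probability of observing k outbreaks in a week.

    p_zero is the probability of a structural zero (e.g., a summertime hiatus).
    """
    poisson = math.exp(-mu) * mu ** k / math.factorial(k)
    if k == 0:
        return p_zero + (1.0 - p_zero) * poisson
    return (1.0 - p_zero) * poisson


# Hypothetical coefficients, chosen only to produce a wintertime peak.
example_coef = {"intercept": 0.5, "sin": 0.0, "cos": 1.0,
                "year": 0.0, "exposure": -0.05}
midwinter = expected_count(week=0, year=0, exposure=2.0, coef=example_coef)
midsummer = expected_count(week=26, year=0, exposure=2.0, coef=example_coef)
```

Because the harmonic terms absorb the regular annual cycle, the coefficient on the exposure term reflects only the association that remains after that cycle has been accounted for.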

Exposure variables were screened for unconditional associations using univariable Poisson models, and variables with P ≤ 0.25 in univariable analyses were considered candidate variables for multivariable analyses (Dohoo et al., 2003). We also considered variables to be candidates for multivariable models if they were not significant in univariable analyses but were identified as statistically significant predictors of case occurrence in the case-crossover analysis (described below). Multivariable models were created using manual backward elimination, with variables retained as statistically significant at P ≤ 0.2 (Dohoo et al., 2003). Once an appropriate main effects model was identified, two-way interactions between the remaining variables were checked for statistical significance (P ≤ 0.2). We examined the overall fit of the Poisson model using generalized linear model post-estimation. Models were also evaluated by graphically comparing the observed counts and the expected counts for both the Poisson and zero-inflated Poisson (ZIP) models. Poisson and ZIP models were constructed using Stata 9.0 (Stata Corporation, College Station, TX).
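The screening rule described above can be sketched in a few lines of Python. This is a hedged illustration only: the function name, the variable names, and the P values are placeholders invented for the example and are not results from this study.

```python
def screen_candidates(univariable_p, crossover_significant, threshold=0.25):
    """Return exposure variables eligible for multivariable modelling.

    A variable qualifies if its univariable P value is at or below the
    threshold, or if it was significant in the case-crossover analysis.
    """
    selected = []
    for name, p_value in univariable_p.items():
        if p_value <= threshold or name in crossover_significant:
            selected.append(name)
    return selected


# Placeholder P values for illustration only; these are not study results.
placeholder_p = {"exposure_a": 0.03, "exposure_b": 0.40, "exposure_c": 0.60}
placeholder_crossover = {"exposure_b"}
candidates = screen_candidates(placeholder_p, placeholder_crossover)
```

In this example, the second placeholder variable is carried forward despite its univariable P value, because it is flagged as significant in the case-crossover analysis.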

Acute associations between case occurrence and environmental exposures were evaluated using a case-crossover study design (Maclure, 1991), implemented in SAS version 9.1 (SAS Institute, Cary, NC). This design is based on comparison of exposure measures during case (“hazard”) and control periods; exposures that are causally related to case occurrence are expected to occur with greater frequency during the hazard period. This study design is useful for evaluating the relationship between brief transient exposures and infrequent outcomes (Maclure, 1991). We used a 2:1 matched case-crossover design with control days matched by day-of-week, and selected from a 3-week time stratum that included the case date. As such, “control” days could be separated from case days by up to 2 weeks, and could follow, precede, or both follow and precede case days. This “random directionality” of control selection was utilized in order to avoid biases that might be associated with underlying temporal trends in exposures (Levy et al., 2001). Odds ratios for case occurrence were estimated using conditional logistic regression models.
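The control-selection scheme can be sketched as follows. This Python example is one possible reading of the design, not the SAS code used in the analysis: it assumes that the position of the case week within the 3-week stratum is chosen at random, and the function name, seed, and case date are hypothetical.

```python
import random
from datetime import date, timedelta

WEEKS_IN_STRATUM = 3


def control_days(case_day, rng):
    """Pick the two control days for one case day.

    The case week is placed at a random position within a 3-week stratum;
    the same weekday in the other two weeks serves as the control days.
    """
    case_position = rng.randrange(WEEKS_IN_STRATUM)
    return [case_day + timedelta(weeks=week - case_position)
            for week in range(WEEKS_IN_STRATUM)
            if week != case_position]


rng = random.Random(2008)  # fixed seed so the sketch is reproducible
example_case = date(2007, 1, 15)  # hypothetical case date
example_controls = control_days(example_case, rng)
```

Depending on the random position, the two control days both follow the case day, both precede it, or bracket it, and they are never more than 2 weeks from it.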

Results

During the time period of interest (November 10, 2005 to March 6, 2008), 361 outbreaks of gastroenteritis in the GTA were investigated with the assistance of the Ontario Central Public Health Laboratory. Norovirus was identified as the causative agent in 253 of these outbreaks (70%). Most Norovirus outbreaks (60%) occurred in the urban core of the Greater Toronto Area, with the remainder occurring in suburban regions (Fig. 1). Long-term care facilities accounted for 38% of all Norovirus outbreaks in the dataset, followed by daycares and schools (13%), hospitals and clinics (6%), and restaurants and other food service providers (1%). Outbreaks showed marked wintertime seasonality (Fig. 2).

Fig. 2

The occurrence of Norovirus outbreaks in the Greater Toronto Area is highly seasonal, with the majority of outbreaks occurring in the winter months. Each outbreak in the figure represents only the index case (first person from the outbreak identified as positive for Norovirus by reverse transcriptase-polymerase chain reaction (RT-PCR) for genogroups G1 and G2, and/or by electron microscopy for Norovirus). This definition was used to control for Norovirus transmission that occurred from person to person after the first individual became ill. Bars represent the observed number of new outbreaks per week. Curves represent the number of outbreaks predicted by our multivariate regression models (Poisson model is a dashed line and the zero-inflated Poisson model is a solid line).

Poisson and Zero-Inflated Poisson Regression

In Poisson and zero-inflated Poisson models, only two environmental variables (temperature and inverse precipitation) were significantly associated with Norovirus outbreaks in univariate models that controlled for year and seasonal oscillations (Table 1). However, best fit multivariate models (both Poisson and zero-inflated Poisson) included Lake Ontario temperature and mean flow in the Don River; temperature was retained in the Poisson model as well (Table 1). Both models demonstrated excellent fit to available data (Fig. 2). The most recent hydrology data for the GTA watershed (Don River flow values between January and March 2008) were unavailable. The release of these data to researchers is generally delayed by several months from the time they are collected, to allow for verification and quality assurance review by scientists at Environment Canada before the official dataset is released. Therefore, the multivariate models are only able to predict the number of outbreaks up to the first week of 2008.

Table 1 Weekly Meteorological and Hydrological Exposures and the Incidence of Norovirus Infections in the Greater Toronto Area (GTA)

Case-Crossover Analysis

As in Poisson models, a substantial risk of Norovirus case occurrence was linked to low Lake Ontario temperatures (<4°C) in the week prior to case onset (Fig. 3) (hazard ratio [HR], 5.61 [95% CI, 2.81–11.12]). More modest risk was associated with higher than average flow rates in the Don River (>2.5 m³/s) in the week prior to case onset (Fig. 3) (HR, 3.17 [95% CI, 2.30–4.36]). For both exposure variables, the highest hazard ratios were found 24–48 h prior to case onset, corresponding to the average incubation period of Norovirus infection (Szucs and Matson, 2005).
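As a reading aid for the ratio estimates above, the short Python sketch below shows how a reported 95% confidence interval relates to its point estimate on the log scale. It assumes Wald-type intervals that are symmetric on the log scale; the text does not state how the intervals were computed, so this is an illustrative check rather than a reconstruction of the analysis, and the function name is hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Recover the log-scale midpoint and standard error from a 95% CI.

    Assumes the interval is exp(log(estimate) +/- z * SE).
    """
    midpoint = math.sqrt(lower * upper)  # geometric mean of the limits
    std_error = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return midpoint, std_error


# Interval limits as reported in the text for lake temperature and river flow.
lake_mid, lake_se = log_scale_summary(2.81, 11.12)
river_mid, river_se = log_scale_summary(2.30, 4.36)
```

Under this assumption, the geometric mean of each pair of limits falls close to the corresponding reported point estimate, and the wider interval for lake temperature corresponds to a larger standard error on the log scale.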

Fig. 3

Low lake temperatures of <4°C (a) and high river flow (>2.5 m³/s) in the Don River (b) in the week prior to case occurrence were linked to increased risk of Norovirus case occurrence. Hazard ratios are presented on a log scale. For both exposure variables, the highest hazard ratios were found 2–3 days prior to case onset. This corresponds to the average incubation period of Norovirus infection. The origin (where the two axes intersect) indicates the day of case occurrence. Error bars represent the upper and lower limits of the 95% confidence intervals.


Discussion

Norovirus is a common nonbacterial cause of gastroenteritis outbreaks (Green, 1997; Fankhauser et al., 2002) with stereotyped wintertime seasonality (Lopman et al., 2003), but to date the genesis of this seasonality has been obscure. We used both traditional regression models (in which case counts are aggregated) and a case-crossover approach (an individual, case-based approach) to evaluate environmental contributions to Norovirus risk. Using these very different approaches, we identified both acute and longer-term associations between local source-water hydrology and Norovirus outbreak occurrence. In our analysis, the risk of Norovirus outbreaks in the community increased with decreasing Lake Ontario temperature, a finding that is concordant with the striking wintertime seasonality of this pathogen. Likewise, we also found an association between high flow in the Don River and Norovirus occurrence in the community.

As with any single observational study, we can demonstrate only a statistically significant association between watershed conditions and Norovirus outbreak risk; we cannot state with certainty whether this relationship is causal. However, we feel that the results are of sufficient potential importance to warrant further study of the role that an aquatic reservoir may play in Norovirus epidemiology, and in particular of the specific mechanisms by which local hydrology could influence Norovirus risk in the community. We propose that one such mechanism may be a feedback loop between discharged sewage effluent and drinking water within the GTA. A number of sewage treatment plants discharge treated wastewater into the Don River. In addition, the lower reaches of the Don River have a number of combined sewer overflows (CSOs), through which raw sewage bypasses treatment altogether and enters the river directly during storm events. The water in the Don River then travels south into Lake Ontario, which is the source of GTA drinking water. Water is drawn from Lake Ontario into the municipal water treatment system, treated, and then distributed across the city. Given this situation, we hypothesize that environmental conditions such as those identified in this study give rise to such a feedback loop.

We hypothesize that Norovirus outbreaks begin in the community when a susceptible individual ingests water that contains infectious Norovirus particles. This individual, and all of the others whom they infect by direct contact, shed virus particles into the sewage system for the duration of their infection. The sewage treatment process is not sufficient to remove or inactivate all of the virus (Lodder and Husman, 2005; Pusch et al., 2005; van den Berg et al., 2005). Therefore, virus is discharged to the Don River, and favorable environmental conditions, such as low temperature (Jones et al., 2007), allow the virus to survive in the source water until the water enters the treatment system. Because of the extremely small size of this virus, and experimental data suggesting that inactivation by chlorine is ineffective (Albert and Fehlhaber, 2007), standard water treatment protocols are not well suited for removing Norovirus (Huffman et al., 2003; Laverick et al., 2004; Pusch et al., 2005; Tree et al., 2005; Gutierrez et al., 2007). Water quality indicators in Ontario do not currently include any viral enteric pathogens. Taken together, these observations suggest that waterborne Norovirus may be more common than previously thought. The mechanism that we propose is consistent with empirical data related to Norovirus burden in untreated and treated sewage, and in surface waters, in other cities (Hutson et al., 2004; Laverick et al., 2004; Maunula et al., 2005; Pusch et al., 2005; van den Berg et al., 2005; Westrell et al., 2006).

In our multivariable Poisson models, high average weekly flows in the Don River were associated with decreased risk of Norovirus outbreaks, whereas in the case-crossover analysis high flow was associated with increased risk. While the source of this discrepancy is unclear, there are several possibilities. It is important to note that the two methodologies are fundamentally quite different: Poisson models evaluate the response of aggregate outbreak counts to average measures of environmental exposure (in this case, averaged over a 1-week period), whereas case-crossover methodology evaluates the acute effect of variability in environmental exposures on the risk of a single outbreak. We believe that the case-crossover result is more biologically plausible, and it is possible that the effect identified in Poisson models results from residual confounding not removed by smoothing (due, for example, to prolonged increases in river flow in the springtime, when Norovirus outbreak risk subsides). Our river-related findings highlight the dichotomy between aggregate and individual measures of health outcomes, and may represent an example of “ecological fallacy” resulting from aggregate-level data. The ecological fallacy is an incorrect inference about an individual that is based on aggregate data for a group (Robinson, 1950; Portnov et al., 2007). This finding highlights the importance of considering several complementary methodologies in examining the effect of environmental exposures on health outcomes. Regardless of the direction of effect, our main findings are strongly suggestive of local watersheds and source waters as important modifiers of Norovirus outbreak risk, and indeed point to local waters as an important reservoir for Norovirus.

Like any observational study, our study is subject to several limitations. First, variability in environmental measures and in timing of the onset of outbreaks, and the under-reporting of outbreaks that is common to all public health surveillance systems, could have affected the estimates of effect reported here (Grimes and Schulz, 2002; Thomas et al., 2006). However, it is important to note that unless the likelihood of outbreak identification was correlated with a local hydrological condition (which is unlikely), any bias in effects would be towards the null. As such, the effects we report here are likely to represent lower-bound effects. Second, we assumed that all outbreaks were independent of one another (i.e., no outbreaks were linked by unrecognized person-to-person transmission); if this were not the case, we would again have been subject to nondifferential misclassification of exposures, which would again have resulted in a bias towards a null effect. A third limitation relates to the absence of data on Norovirus abundance in either the GTA watershed or local treated waters. Our epidemiological investigations have identified the collection of such environmental data as a key area for future study.

In conclusion, enhanced understanding of the relationship between environmental and population health can suggest novel strategies for the prevention and control of outbreaks. Our results implicate conditions in the local watershed in the genesis of Norovirus outbreaks in the Greater Toronto Area. These findings lead to a biologically plausible hypothesis for the observed wintertime seasonality of this disease. As a next step, we plan to collect and analyze water samples from across the GTA in order to test the hypothesis we have laid out in this article.