Introduction

Hand, foot, and mouth disease (HFMD) is now considered a global health threat (WHO). In recent decades, large-scale outbreaks of HFMD have been common in the Asia-Pacific region, especially in East and Southeast Asia. In 1998, an epidemic of enterovirus 71 infection caused HFMD in thousands of people in Taiwan, some of whom died (Ho et al. 1999). In 2008, Singapore experienced its largest ever outbreak of HFMD, with 29,686 cases (Wu et al. 2010). Epidemics have also occurred in Malaysia (2005) (Chua and Kasri 2011), Vietnam (2011) (Nguyen et al. 2014), and Japan (2011) (Fujimoto et al. 2012). China is among the countries most seriously affected by HFMD. In the spring of 2008, a large, unprecedented HFMD outbreak was reported among infants and young children in Fuyang City, China, with a high concentration of fatal cases (22 cases) (Zhang et al. 2010). The situation has remained severe in recent years. From 2010 to 2014, HFMD ranked first among infectious diseases in annual reported cases (average cases per year: 1,887,646), and deaths due to HFMD ranked approximately fifth (http://www.nhfpc.gov.cn/zhuzhan/yqxx/lists.shtml). To date, there are still no specific vaccines or effective treatments for clinical application. Therefore, there is an urgent need to identify the risk factors for HFMD and to forecast short-term trends in HFMD incidence for early disease surveillance and prevention.

Many epidemiological studies have documented weather change as an important risk factor in triggering the onset of HFMD, but their findings are inconsistent (Onozuka and Hashizume 2011; Xu et al. 2015; Yang et al. 2015; Liu et al. 2015a, b; Urashima et al. 2003; Wei et al. 2015). For instance, some studies showed that higher average temperature or humidity may increase the incidence of HFMD (Onozuka and Hashizume 2011; Xu et al. 2015; Yang et al. 2015; Liu et al. 2015a; Wei et al. 2015). However, other studies reported no significant association or even negative associations (Liu et al. 2015b; Urashima et al. 2003). This discrepancy across regions may be attributed to differences in weather conditions, demographic characteristics, and hygiene conditions. In addition, previous studies on weather change and HFMD in China mainly focused on coastal cities, such as Guangzhou and Shanghai, and few studies are available for inland cities with markedly different climatic features. At the same time, the incidence of HFMD in Huainan has been increasing, as shown in Fig. 1. Therefore, it is necessary to further explore the impact of weather change on HFMD in inland areas of China.

Fig. 1 The annual morbidity of HFMD from 2009 to 2014 in Huainan, China

The auto-regressive integrated moving average (ARIMA) model has been widely used in environmental epidemiology (Wei et al. 2015; Li et al. 2015) and to predict the morbidity of infectious diseases (Earnest et al. 2005; Martinez and Silva 2011; Zheng et al. 2015; Liu et al. 2011). Previous papers have shown that the ARIMA model is a very useful tool for modeling the temporal dependence structure of a time series (Zheng et al. 2015; Liu et al. 2011). However, not all prediction models are equally accurate. To improve forecasts, several studies have attempted to obtain better-fitting models through statistical methods (Zheng et al. 2015; Eswaran and Logeswaran 2012). As climate change proceeds, weather factors are known to correlate with infectious disease, but little is known about whether they can improve forecast models for infectious disease. Although some researchers have recognized the importance of ambient temperature in predicting HFMD cases (Wei et al. 2015), they may not have systematically evaluated the impact of weather variables on the ARIMA model. Meanwhile, previous studies have reported that weather effects vary across populations with different demographic characteristics, such as gender and occupation (Xu et al. 2015; Yang et al. 2015). We therefore speculated that the extent to which weather factors improve model-fitting precision may differ among groups, a question that remains far from clear. In addition, as mentioned above, many factors (e.g., pattern of weather effects, epidemic trend, and demographic characteristics) affect how a forecast model should be built for a given area. It is therefore difficult to transfer a prediction model from one region to predict the incidence of HFMD in another, which further motivates this research. The present study aims to analyze the influence of temperature on model fitting and to build an accurate model for Huainan. The results will assist the local public health department in achieving early detection and early warning.

Materials and methods

Study area

Huainan is located in the north of Anhui Province, in eastern China (32° 65′ N, 117° 02′ E), as shown in Fig. 2. It has a population density of 940 persons per km² (in 2013: population = 2.43 million; land area = 2,585 km²). Huainan has a typical subtropical climate, with an average temperature of 16.7 °C and a relative humidity of 66.5 %.

Fig. 2 Geographical location of study area in China

Data sources

Daily counts of HFMD cases in Huainan between 1 January 2009 and 31 December 2014 were obtained from the Huainan Center for Disease Control and Prevention (Huainan CDC). Since 2008, HFMD has been categorized as a Class “C” infectious disease, and local hospitals and clinics have been required to report HFMD cases to Huainan CDC via the online communicable disease surveillance system within 24 h. All HFMD cases were first diagnosed symptomatically. The clinical diagnoses were then strictly examined and verified by the local CDC using laboratory evidence. Overall, the representativeness and accuracy of HFMD case reporting are of high quality (Huang et al. 2013). We collected daily meteorological data on average temperature, relative humidity, rainfall, and barometric pressure from the Huainan Bureau of Meteorology. Weekly and monthly values of the weather variables were calculated as averages of the daily records.
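The aggregation of daily weather records into weekly values can be illustrated with a minimal Python sketch. The function name `weekly_means`, the use of consecutive 7-day blocks starting from the first record, the dropping of an incomplete final block, and the temperature readings are all assumptions made for illustration; the source does not state how week boundaries were defined.

```python
from datetime import date, timedelta
from statistics import mean


def weekly_means(start, daily_values, days_per_week=7):
    """Average consecutive blocks of daily readings into weekly values.

    Returns a list of (first day of block, block mean) pairs.
    An incomplete final block is dropped (an assumption of this sketch).
    """
    weeks = []
    last_start = len(daily_values) - days_per_week
    for i in range(0, last_start + 1, days_per_week):
        block = daily_values[i:i + days_per_week]
        weeks.append((start + timedelta(days=i), mean(block)))
    return weeks


# Hypothetical daily mean temperatures (degrees C) for two weeks.
temps = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 12.0,
         15.0, 16.0, 14.0, 17.0, 16.0, 15.0, 19.0]
result = weekly_means(date(2009, 1, 1), temps)
for week_start, value in result:
    print(week_start.isoformat(), value)
```

Monthly values could be obtained in the same way by grouping on calendar month instead of fixed-length blocks.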

Statistical analysis

First, descriptive statistics based on daily data were used to describe the distribution of weather variables and HFMD cases. Additionally, the Spearman correlation test was performed to explore the crude relationships between meteorological variables and HFMD.
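For readers unfamiliar with the Spearman coefficient, the following sketch computes it as the Pearson correlation of the ranks, with tied values sharing their average rank. It is a plain-Python illustration rather than the SPSS or R routine used in the study; the function names and the example series are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical daily temperatures and case counts.
temperature = [5.0, 12.0, 18.0, 25.0, 30.0]
cases = [3, 1, 20, 9, 40]
print(round(spearman(temperature, cases), 2))
```

Because the coefficient depends only on ranks, it captures monotone but non-linear relationships, which is why it suits a crude first look at weather and case counts.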

Second, because weather variables may have lagged effects on HFMD, Poisson regression models combined with a distributed lag non-linear model (DLNM) were applied to examine the impact of temperature, relative humidity, rainfall, and barometric pressure on HFMD, respectively. Because the data from the peak season (April–June) (Supplementary Fig. S1) satisfied the normality tests of the above-described models, this season was chosen as the study period when analyzing the effect of weather variables on the daily number of HFMD cases. Seasonality and long-term trend were controlled for using a natural cubic spline with three df per year of data. When analyzing the influence of one weather indicator on HFMD, the other weather factors were controlled for in the model as potential confounders. Guided by the lowest Akaike Information Criterion (AIC), we plotted exposure-response curves with lags of up to 14 days (Onozuka and Hashizume 2011). The results showed that only ambient temperature was significantly associated with HFMD, and its effect was largest at a lag of 7 days (Supplementary Fig. S2). Therefore, ambient temperature at a 1-week lag was included in the prediction model. In brief, DLNM was used to identify the weather factors influencing HFMD, and the risk factors identified by the DLNM were then integrated into a seasonal ARIMA (SARIMA) model to fit the HFMD cases.
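The lag structure underlying a distributed lag model can be sketched as follows. The sketch only builds the matrix of lagged exposures (each day paired with the temperatures of the preceding 0 to 14 days); it does not construct the spline-based cross-basis or fit the Poisson regression used in the study. The function name `lag_matrix` and the temperature series are hypothetical.

```python
def lag_matrix(exposure, max_lag):
    """Build rows of lagged exposures.

    The row for day t holds exposure[t], exposure[t-1], ..., exposure[t-max_lag].
    The first max_lag days are skipped because their lag history is incomplete.
    """
    rows = []
    for t in range(max_lag, len(exposure)):
        rows.append([exposure[t - lag] for lag in range(max_lag + 1)])
    return rows


# Hypothetical daily mean temperatures (degrees C) for 20 days.
daily_temp = [float(20 + d) for d in range(20)]
lagged = lag_matrix(daily_temp, max_lag=14)

# Column 7 holds the temperature one week before each day, i.e. the
# lag at which the study reports the largest temperature effect.
temp_lag7 = [row[7] for row in lagged]
print(len(lagged), len(lagged[0]), temp_lag7[0])
```

A DLNM then applies basis functions along both the exposure and the lag dimension of such a matrix, so that the effect of temperature can vary smoothly with its value and with the delay.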

Third, the Expert Modeler was used to analyze the time series. In contrast to fitting an ARIMA model manually, the Expert Modeler automatically identifies and validates the best-fitting ARIMA parameters for one or more target variables, thus avoiding errors due to manual parameter selection. Figure 3 shows the seasonal pattern of the series from January 2009 to December 2014. The model used 312 weeks of disease surveillance and meteorological data. In this case, the Expert Modeler automatically selected the seasonal ARIMA (SARIMA) method for data analysis. A SARIMA model is denoted as ARIMA (p, d, q) × (P, D, Q)_s, where p is the order of the auto-regressive part, d is the order of differencing, q is the order of the moving average part, s is the length of the seasonal period, and P, D, and Q in the second parenthesis are the seasonal counterparts of p, d, and q. We estimated the model parameters using the maximum likelihood method. The normalized Bayesian Information Criterion (BIC) and the stationary R-squared (sR²) were used to assess the goodness-of-fit of the SARIMA models. Additionally, the Ljung-Box test was applied to the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the residuals, which must be equivalent to white noise (P > 0.05). Ljung and Box reported that the tentative model was inadequate when the P value of the Q-statistic was not greater than 0.8 (Ljung and Box 1978). A sensitivity analysis was conducted by changing the time unit to months when building the SARIMA model. All statistical analyses were performed using SPSS version 16.0 and R version 2.15.
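Two ingredients of the procedure above, differencing (the d and D terms) and the Ljung-Box Q statistic, can be illustrated with a short plain-Python sketch. It is not the Expert Modeler algorithm. The function names, the seasonal period of 4, and the example series are hypothetical, and the P value is omitted because it requires a chi-square distribution function that the Python standard library does not provide.

```python
def difference(series, lag=1):
    """Difference a series: lag=1 for ordinary (d), lag=s for seasonal (D)."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def autocorrelation(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[t] - m) * (x[t - k] - m) for t in range(k, n))
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..h} r_k^2 / (n-k)."""
    n = len(residuals)
    total = sum(autocorrelation(residuals, k) ** 2 / (n - k)
                for k in range(1, max_lag + 1))
    return n * (n + 2) * total


# Hypothetical series: linear trend plus a spike every 4th period (s = 4).
series = [t + (5 if t % 4 == 0 else 0) for t in range(12)]
seasonal_diff = difference(series, lag=4)    # removes the seasonal spike
both_diff = difference(seasonal_diff, lag=1) # removes the remaining trend
print(seasonal_diff, both_diff)

# Strongly alternating residuals are far from white noise, so Q is large.
q = ljung_box_q([1.0, -1.0, 1.0, -1.0, 1.0, -1.0], max_lag=1)
print(round(q, 3))
```

In practice Q is compared with a chi-square distribution whose degrees of freedom equal the number of lags minus the number of estimated ARMA parameters; a large P value indicates that the residuals are compatible with white noise.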

Fig. 3 The time-series distribution of daily weather variables and HFMD cases in Huainan, China, during 2009–2014

Results

Descriptive analysis

From 2009 to 2014, a total of 14,152 cases were reported in Huainan, with a mean of 45 cases per week (rounded from 45.4). The distribution of HFMD cases differed significantly by gender and occupation: most cases occurred in males (65.5 %) and in scattered children (92.2 %). The incidence of HFMD decreased from 2010 to 2011 and increased during 2009–2010 and 2011–2014, mainly reflecting changes in reported incidence (Fig. 1). The average values of weekly mean temperature, relative humidity, rainfall, and barometric pressure were 16.8 °C (range, 0–35 °C), 66.6 % (range, 32–90 %), 2.4 mm (range, 0–33 mm), and 1,010.8 hPa (range, 991–1,030 hPa), respectively. More detailed information on the weather variables and HFMD cases is summarized in Table 1.

Table 1 Characteristics of weekly weather variables and HFMD cases in Huainan, China, 2009–2014

Correlation analysis

Table 2 shows the Spearman correlations between weather variables and HFMD cases. Daily mean temperature and barometric pressure were significantly correlated with HFMD cases (temperature: r_s = 0.34, P < 0.01; barometric pressure: r_s = 0.36, P < 0.01).

Table 2 Spearman’s correlation coefficients between daily weather variables and HFMD cases in Huainan, China, 2009–2014

Time series analysis

Seasonality of weather variables and HFMD cases was observed in the time-series distribution from 2009 to 2014 in Huainan, China (Fig. 3). Considering the collinearity and lagged effects of weather variables, DLNM was further used to examine the relationship between weather factors and HFMD cases. The association between average temperature and HFMD remained significant, whereas no effect of barometric pressure was observed (Fig. 4). Therefore, ambient temperature was included in the SARIMA model to predict the number of HFMD cases. The SARIMA model with temperature showed better fit and validity than the model without it (sR² increased, while the BIC decreased) (Tables 3 and 4; Fig. 5). The ACF and PACF of the residuals are shown in Fig. S3 (see supplementary materials); the residuals were white noise (P > 0.05).
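The comparison of two models by BIC can be sketched as follows. The formula used, ln(MSE) + k ln(n)/n with MSE computed on n − k degrees of freedom, is the definition commonly documented for the normalized BIC reported by SPSS; it is assumed here rather than confirmed by the source. The function name, the observed counts, and both sets of fitted values are hypothetical and are not the study's data.

```python
from math import log


def normalized_bic(observed, fitted, n_params):
    """Normalized BIC = ln(MSE) + k * ln(n) / n (assumed definition).

    MSE is the residual sum of squares divided by (n - k),
    where k is the number of estimated parameters.
    """
    n = len(observed)
    sse = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    mse = sse / (n - n_params)
    return log(mse) + n_params * log(n) / n


# Hypothetical weekly case counts and fitted values from two models.
observed = [10, 12, 14, 16]
fitted_without_temp = [11, 11, 15, 15]  # stands in for Model_1 (2 parameters)
fitted_with_temp = [10, 12, 14, 17]     # stands in for Model_2 (3 parameters)

bic_1 = normalized_bic(observed, fitted_without_temp, n_params=2)
bic_2 = normalized_bic(observed, fitted_with_temp, n_params=3)
print(round(bic_1, 3), round(bic_2, 3))
```

The penalty term k ln(n)/n grows with the number of parameters, so adding temperature as a predictor lowers the BIC only if the reduction in residual error outweighs the cost of the extra parameter.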

Fig. 4 The cumulative effects of weather factors on HFMD

Table 3 Summary statistics of SARIMA models without weather variables for weekly HFMD cases in different groups (Model_1)

Table 4 Summary statistics of SARIMA models with ambient temperature for weekly HFMD cases in different groups (Model_2)

Fig. 5 Results of the prediction model on weekly data in Model_1 (model without ambient temperature) and Model_2 (model with ambient temperature)

Subgroup analysis

HFMD cases were stratified by gender and occupation. In the subgroup analysis by gender, Model_2 (SARIMA model with temperature) fitted the data for males clearly better than the data for females (Table 4 and Fig. S4). Compared with nursery children, scattered children may be better suited to prediction with Model_2, for which the model showed high precision (Table 4 and Fig. S4). The Ljung-Box Q statistics for the different groups are shown in Tables 3 and 4; all significance levels were greater than 0.05. To check the robustness of our findings, we changed the basic time unit in the SARIMA model, and similar results were found (Tables S1, S2 and Figs. S5, S6).

Discussion

In this study, we systematically evaluated the relationship between weather factors and HFMD, and further explored the role that ambient temperature plays in disease forecasting with the SARIMA model. Our results showed that temperature was significantly associated with HFMD and that including ambient temperature can improve the accuracy of the forecasting model. In addition, the goodness-of-fit of the SARIMA model varied by gender and occupation.

We first probed the relationship between weather factors and HFMD using rank correlation analysis. Both ambient temperature and barometric pressure were correlated with the occurrence of HFMD. However, we did not find any association between barometric pressure and HFMD when we used DLNM to examine this relationship. Plausible explanations for this discrepancy are serious collinearity between weather variables and the influence of lagged effects. We also found non-significant associations between humidity or rainfall and HFMD, which are inconsistent with previous studies (Yang et al. 2015; Cheng et al. 2014). For instance, two Hefei studies indicated that high relative humidity and extreme precipitation may trigger childhood HFMD (Yang et al. 2015; Cheng et al. 2014), and a Guangzhou study showed a positive effect of higher rainfall at lags of four and eight days (Chen et al. 2014). These discordant results may be due to differences in local weather conditions, population characteristics, and socioeconomic factors across regions. Ultimately, temperature was identified as a risk factor that may increase HFMD infection in Huainan, China.

As an important weather indicator, ambient temperature has profound impacts on infectious disease (Onozuka and Hashizume 2011; Phung et al. 2015; Patz and Olson 2006). The Intergovernmental Panel on Climate Change (IPCC) forecast an increase in global average temperature of 1.4–5.8 °C by 2100 (IPCC 2001). Infectious diseases are sensitive to temperature change through multiple mechanisms: first, through changes in the survival and reproduction rates of pathogens (Meerburg and Kijlstra 2009; Patz et al. 2005); second, through effects on people’s automatic thermoregulation and immunoregulation systems (Xu et al. 2013); and third, through geographic shifts of vectors or reservoirs (Patz et al. 2014).

In past decades, more attention has been paid to the effects of weather factors on HFMD, but few researchers have worked on establishing HFMD forecasts (Wei et al. 2015; Liu et al. 2015b). The DLNM method was proposed by Gasparrini et al. (2010); it simultaneously fits the potential non-linear relationship and the lagged effect of an exposure, and it is now widely applied in the field of climate change and human health. Meanwhile, the ARIMA model is considered one of the most widely used time-series forecasting methods because of its structured modeling basis and acceptable forecasting performance (Wong et al. 2005). This study takes advantage of both DLNM and the SARIMA model to investigate the influence of weather factors and their role in forecasting short-term incidence trends.

According to Fig. 1, the morbidity of HFMD shows a rising trend, which indicates that HFMD is still a major public health problem with a high burden in Huainan, China. It is therefore highly cost-effective to detect an HFMD epidemic in its early stages in order to prevent and reduce its impact, and weather-based early warning built on an accurate forecast model is very important for personal protection, community intervention, and policy making. In this paper, we applied a SARIMA (p, d, q) × (P, D, Q)_s model to analyze the surveillance data of HFMD in Huainan, China. We first built the SARIMA model from the HFMD cases alone, which gave relatively high accuracy (sR² = 0.77; BIC = 5.80 in Table 3). Because ambient temperature was found to be associated with HFMD and the SARIMA model allows the integration of external factors (e.g., weather variables), we speculated that a model incorporating the temperature variable would be better for forecasting the weekly number of HFMD cases in Huainan, and our results confirmed this hypothesis (sR² = 0.83; BIC = 5.50 in Table 4). Whereas previous techniques generally improved the goodness-of-fit of the model through statistical approaches (Zheng et al. 2015; Eswaran and Logeswaran 2012), our research may provide new insight into enhancing the accuracy of the ARIMA model.

Based on Model_2, we further investigated whether the effect of temperature on the accuracy of the SARIMA model varied by gender and occupation. Our analyses suggested that the models fitted better in males and scattered children than in other groups. To our knowledge, boys and scattered children have more outdoor activity than girls and nursery children, which may increase their exposure to temperature changes and thus further increase the possibility of HFMD infection. These observations not only indirectly underline the relationship between temperature and HFMD but also identify the role of temperature in model fitting. On the other hand, compared with scattered children, the sample size of nursery children is small, which may influence the precision of the SARIMA model. Nevertheless, more relevant studies are needed to test our findings and give more reliable explanations.

This study has two major strengths. First, we systematically evaluated the impact of weather patterns in Huainan, China, and then used ambient temperature to improve the fitting precision of the SARIMA model. Second, although previous studies have used the ARIMA model to fit and forecast HFMD in some regions of China (Wei et al. 2015; Liu et al. 2015b), the fit of those models was not good enough (R² < 0.70), which led to instability in their forecast results. To build a stable and effective model, we used the Expert Modeler combined with ambient temperature to build a SARIMA model for predicting HFMD cases, and the results showed a better-fitting model.

Several limitations should be noted when interpreting the results. First, this study is ecological in design, and some bias due to exposure misclassification may be inevitable. Second, the data were collected from one city, so the findings might not generalize to other regions with distinct weather patterns. Third, because few weather stations were used in Huainan, the weather data collected may not be representative of the exposure of the whole population, which may influence the accuracy of HFMD forecasting.

Conclusions

In this paper, the effect of weather factors on HFMD was analyzed. Ambient temperature was selected as the relevant weather factor for building the SARIMA model. Comparative analyses indicated that the model incorporating ambient temperature performs better. Our results provide valuable information for policy makers and public health sectors seeking to construct a best-fitting model and optimize HFMD prevention.