Introduction

Dengue fever is an arboviral disease transmitted by female mosquitoes of the genus Aedes, especially Aedes aegypti (Messina et al. 2015). Studies indicate that around 400 million dengue infections occur globally each year and that nearly four billion people live under direct risk of dengue transmission (Bhatt et al. 2013). In recent years, dengue has expanded into many countries that were previously considered dengue free (Brady et al. 2013). In Nepal, dengue fever is a relatively new disease, first reported about a decade ago, in 2004 (Pandey et al. 2004). However, within a short period, the disease has spread rapidly across wide geographical areas of the country, especially the southern low-land Tarai and the lower-elevation hill districts, putting almost two thirds of the population under direct risk of the disease.

The spatial distribution of dengue fever is determined by a complex interaction of environmental, geographic, and socioeconomic factors (Wijayanti et al. 2016; Méndez-Lázaro et al. 2014; Lin and Wen 2011; Khormi and Kumar 2011). Several previous studies have recognized the role of meteorological factors such as temperature, precipitation, and relative humidity in the spatial distribution of dengue fever (Méndez-Lázaro et al. 2014; Wu et al. 2009; Limper et al. 2016). Other studies identified socioeconomic factors as the main driving forces for the occurrence and outbreak of dengue fever (Khormi and Kumar 2011). Proximity variables, including distance to major roads, water bodies, and health facilities (Wijayanti et al. 2016), are also important for the spatial variation of dengue fever. Recently, remote sensing has increasingly been used in human health studies (Beck 2000). Satellite-based precipitation estimates from the Tropical Rainfall Measuring Mission (TRMM) (Méndez-Lázaro et al. 2014), the normalized difference vegetation index (NDVI) (Arboleda et al. 2009; Ge et al. 2016; Troyo et al. 2009), the normalized difference built-up index (NDBI), the normalized difference water index (NDWI), and land surface temperature (LST) (Méndez-Lázaro et al. 2014; Roslan et al. 2016) have been widely used as environmental proxies to assess the spatial variation in dengue and other mosquito-borne diseases.

Previous studies have investigated the spatial association of dengue fever with various potential risk factors (Qi et al. 2015; Wijayanti et al. 2016). Most of these studies were based on a global model, which assumes that the relationships between the predictors and the outcome variable are homogeneous (or stationary) across the study area. Further, a global model assumes a normal distribution and no spatial autocorrelation in the dataset. Therefore, global models produce parameter estimates which represent an “average” type of behavior (Fotheringham and Brunsdon 2010). However, in practice, the relationships between variables might be non-stationary and vary geographically (Cressie 1993). According to Tobler, “everything is related to everything else, but near things are more related than distant things” (Tobler 1970). When relationships vary over space, a global model used to assess the spatial association tends to violate these assumptions and explains only a small share of the deviance. As a result, a predictive map based on such a global model is usually subject to high errors, especially in areas where the relationship between the predictors and the outcome variable is weak.

To address the limitations of the global model, several local statistical methods have been developed for empirical spatial analysis. Local statistics examine the spatial association between predictor variables and the outcome variable on the one hand and the heterogeneity of that association on the other. Local forms of spatial analysis also provide a linkage between the outputs of spatial techniques and the powerful visualization capabilities of geographic information systems (GIS) and some statistical graphics packages (Fotheringham and Brunsdon 2010). Local indicators of spatial association (LISA) (Anselin 2010), local Gi* (Ord and Getis 2010), local Moran’s I (Anselin 2010), geographically weighted regression (GWR) (Brunsdon et al. 1998), and spatial regressions are some of the commonly used local spatial statistics. Among these, GWR is the most widely used multivariate local statistic; it was proposed by Brunsdon et al. to cope with spatially non-stationary processes by allowing parameters to vary locally (Brunsdon et al. 1998, 2010). It should be noted that the GWR approach does not assume that relationships vary across space but is a means to identify whether or not they do. If the relationships do not vary across space, the global model is an appropriate specification. GWR has been used widely to assess spatial relationships in various fields, including but not limited to land use change, urbanization, and various infectious diseases within the broad field of epidemiology (Corner et al. 2013; Ge et al. 2016; Zheng et al. 2014).

However, not all the variables considered in GWR modeling necessarily vary over space, and some of them may exhibit global effects (Ribeiro et al. 2015). Considering this situation, Brunsdon et al. proposed a semi-parametric geographically weighted regression (s-GWR) model, a mixed modeling approach in which some parameters are fixed globally while others vary locally (Brunsdon et al. 1999). Most recent studies that applied the s-GWR model in spatial analysis showed a better model fit compared to the local GWR and global models (Ehlkes et al. 2014; Manyangadze et al. 2016; Mondal et al. 2015). Several software packages are now available to estimate the global model (OLS) and the local model (GWR), including ArcGIS, QGIS, the spgwr package of R, and GWR 4.0. We used GWR 4.0 because, along with the local GWR and global (OLS) models, it implements the mixed model (s-GWR), which is not available in the other software. In addition, GWR 4.0 implements a significance test (t test) of local parameters.

In this study, we focused on Jhapa district, one of the most dengue-affected districts in Nepal. In Jhapa, the first dengue fever case was reported in 2011, 7 years after the introduction of the disease in the country (C and Agarwal 2014), and a large outbreak was recorded in 2013 (Acharya et al. 2018). A total of 200 laboratory-confirmed cases were reported from across the district. However, very few cases were reported in 2014 (8 cases) and 2015 (2 cases). After this relative silence, another large outbreak occurred in the district in 2016, with a total of 312 cases reported until the end of October. However, no spatially explicit research has been conducted at the local level to understand the spatial epidemiology of dengue fever and the associated potential risk factors. To fill this gap, we studied the spatial distribution of dengue fever and the associated potential environmental and socioeconomic risk factors in Jhapa district of Nepal by comparing global (OLS), local (GWR), and mixed (s-GWR) modeling approaches.

Materials and methods

Study area

Jhapa district is located in the south-eastern part of Nepal (26.36° to 26.80° North and 87.63° to 88.20° East), bordering India in the south and east and Morang and Ilam districts in the west and north, respectively. Most of the land is flat, with an average elevation of less than 300 m; the northern parts of the district are occupied by hills. Administratively, Jhapa is divided into 48 Village Development Committees (VDCs) and three municipalities with a total of 470 wards (Fig. 1). Since the use of smaller spatial units has been shown to provide valuable information on the distribution of disease over space (Matisziw et al. 2008), ward polygons, the lowest administrative unit, were taken as the spatial unit for this analysis. Jhapa district has a subtropical monsoon climate, with summer temperatures from 32 to 35 °C and winter temperatures from 8 to 15 °C. The district receives about 250–300 cm of annual rainfall, most of which occurs during the monsoon season (June–September). The average population density of the district is 510/km2, nearly three times higher than the national average (180/km2) (Central Bureau of Statistics (CBS) 2011b). The population is unevenly distributed and concentrated mainly in the urban centers along the highway. The East-West Highway, passing through the northern part of the district, connects it with the capital city, while the North-South Highway joins it with the eastern hill districts. These two highways play a significant role in the movement of people and goods in and out of the district. We chose this district for this study because of its high dengue incidence and the availability of disease data.

Fig. 1 Location of the study area. The color map is a false color composite of Landsat 8 OLI (2013-10-19) based on bands 6, 5, and 3 (RGB)

Dengue data

Dengue fever is an infectious disease of high public health concern in Nepal. It is reported weekly to the Epidemiology and Disease Control Division (EDCD) of the Government of Nepal by all private and public hospitals through the early warning and reporting system (EWARS) (DOHS 2015). During an epidemic, cases and deaths from dengue fever, along with five other infectious diseases of public health importance, are reported immediately (within 24 h) to the EDCD. In this study, we collected 6 years (2011–2016) of laboratory-confirmed dengue cases from the EDCD, confirmed by either immunoglobulin M (IgM) tests or polymerase chain reaction (PCR) tests and summarized at ward level. These cases include both dengue fever and dengue hemorrhagic fever cases.

Explanatory variables

Considering the previous studies and data availability, seven potential environmental and socioeconomic risk factors including population density (Araujo et al. 2014; Lin and Wen 2011), proximity to road (Hsueh et al. 2012; Mahabir et al. 2012; Qi et al. 2015), proportion of urban area (Wijayanti et al. 2016; Qi et al. 2015), LST (Araujo et al. 2014; Méndez-Lázaro et al. 2014; Roslan et al. 2016), NDVI (Arboleda et al. 2009; Moreno-Madriñán et al. 2014; Qi et al. 2015; Troyo et al. 2009), NDBI, and NDWI (Estallo et al. 2012) were selected to explain ward-level spatial variation of dengue fever in Jhapa district, Nepal.

The ward-level population data were obtained from the 2011 census of the Central Bureau of Statistics (CBS) (Central Bureau of Statistics (CBS) 2011a). Population density was then estimated by dividing the total population of each ward by the total area of the respective ward polygon. Ward and higher-level administrative boundaries of Nepal, along with other GIS layers, were obtained from the Department of Survey of the Government of Nepal; these were prepared based on the topographical base map of 1995 A.D. Proximity to major roads (East-West Highway and North-South Highway) was computed as the Euclidean distance in ArcGIS, and the proportion of urban area was computed from the built-up class of the land cover map with 30 m spatial resolution prepared by ICIMOD (Uddin et al. 2015).

The other four variables were derived from remote sensing using Landsat 8 Operational Land Imager and Thermal Infrared Sensor (OLI/TIRS) imagery (Path/Row: 139/42) dated 2013-10-19 A.D. The Semi-Automatic Classification Plugin implemented in QGIS (http://qgis.com/) was used to process the Landsat image. Radiometric calibration was performed first to convert the satellite digital numbers to at-satellite reflectance using gains and offsets obtained from the image metadata. Atmospheric correction was then applied to convert at-satellite reflectance to surface reflectance using the dark object subtraction (DOS) method (Chavez 1996). The LST was computed from the thermal band (Band 10), and NDVI, NDBI, and NDWI were computed from the green, red, near infrared, and short-wave infrared bands following the formulas summarized in Table 1.

Table 1 Environmental indicators computed based on Landsat 8 OLI
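
As an illustration of how the three spectral indices can be derived from surface reflectance, the sketch below assumes the commonly used normalized-difference definitions (NDVI from near infrared and red, NDBI from short-wave infrared and near infrared, NDWI from green and near infrared). These definitions are an assumption, since the table contents are not reproduced here. The function name and the reflectance values are hypothetical and are not data from this study.

```python
import numpy as np

def normalized_difference(a, b):
    """Return (a - b) / (a + b), with NaN where the denominator is zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.full(denom.shape, np.nan)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

# Hypothetical surface reflectance values for three pixels (not study data).
green = np.array([0.08, 0.10, 0.06])
red = np.array([0.06, 0.12, 0.05])
nir = np.array([0.40, 0.20, 0.03])
swir = np.array([0.20, 0.30, 0.02])

ndvi = normalized_difference(nir, red)    # vegetation: (NIR - Red) / (NIR + Red)
ndbi = normalized_difference(swir, nir)   # built-up:   (SWIR - NIR) / (SWIR + NIR)
ndwi = normalized_difference(green, nir)  # water:      (Green - NIR) / (Green + NIR)
```

All three indices are bounded between − 1 and 1, which makes them comparable across wards regardless of overall brightness.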

As the dengue data were aggregated at ward level, all explanatory variables were also summarized at ward level as their mean values using the zonal statistics function implemented in ArcGIS.
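
The idea behind a zonal mean can be sketched in a few lines. This is not the ArcGIS tool itself but a minimal stand-in, assuming that each raster pixel has already been assigned an integer ward identifier. The function name and the small arrays are hypothetical.

```python
import numpy as np

def zonal_mean(values, zones, n_zones):
    """Mean of `values` per integer zone id in 0..n_zones-1 (NaN for empty zones)."""
    values = np.asarray(values, dtype=float).ravel()
    zones = np.asarray(zones, dtype=int).ravel()
    sums = np.bincount(zones, weights=values, minlength=n_zones)
    counts = np.bincount(zones, minlength=n_zones)
    means = np.full(n_zones, np.nan)
    np.divide(sums, counts, out=means, where=counts > 0)
    return means

# Hypothetical 3x3 raster of an index and the ward id of each pixel.
raster = np.array([[0.2, 0.4, 0.6],
                   [0.1, 0.3, 0.5],
                   [0.0, 0.2, 0.8]])
ward_id = np.array([[0, 0, 1],
                    [0, 1, 1],
                    [2, 2, 1]])
ward_means = zonal_mean(raster, ward_id, n_zones=3)
```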

Mapping and clusters detection

The spatial distribution of the dengue incidence rate was mapped in ArcGIS using the 6-year (2011–2016) average incidence rate with the 2011 census as the base population. We assumed that the base population of 2011 did not change significantly during the study period. The Global Moran’s I (Moran 1948), a widely used indicator of spatial autocorrelation, was calculated to evaluate and quantify the overall spatial autocorrelation (spatial dependence) of dengue incidence in the study area. Its value ranges from − 1 to 1, where 1 indicates perfect positive spatial autocorrelation, 0 implies spatial randomness, and − 1 suggests perfect negative spatial autocorrelation. The significance of the Global Moran’s I was assessed at the 95% confidence level using the z test. Mathematically, it is expressed as:

$$ I=\frac{N}{\sum_i{\sum}_j{w}_{ij}}\frac{\sum_i{\sum}_j{w}_{ij}\left({X}_i-\overline{X}\right)\left({X}_j-\overline{X}\right)}{\sum_i{\left({X}_i-\overline{X}\right)}^2} $$
(1)

where N is the total number of wards, Xi and Xj are the spatially smoothed incidence rates (SSIR) of wards i and j, \( \overline{X} \) is the average SSIR of all wards, and wij is the element of the spatial weight matrix corresponding to the ward pair i and j.
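
The following minimal sketch evaluates Eq. 1 directly. The binary contiguity weights and the four-ward example are hypothetical and serve only to show how clustered and alternating patterns move the statistic toward positive and negative values. The z test used for significance is not reproduced.

```python
import numpy as np

def global_morans_i(x, w):
    """Global Moran's I of values x for an (n, n) spatial weight matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                      # deviations from the mean
    num = (w * np.outer(z, z)).sum()      # sum_i sum_j w_ij z_i z_j
    return (x.size / w.sum()) * num / (z ** 2).sum()

# Hypothetical example: four wards on a line, neighbors share a border.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
clustered = global_morans_i([1.0, 1.0, 5.0, 5.0], w)    # similar values adjacent
alternating = global_morans_i([1.0, 5.0, 1.0, 5.0], w)  # dissimilar values adjacent
```

In this toy layout the clustered arrangement gives a positive value and the alternating arrangement gives a negative value.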

Anselin’s Local Indicators of Spatial Association (LISA) technique using the Local Moran’s I Statistic (Anselin 2010) was used to identify and map the local clusters of unusually high dengue rates. LISA computes a measure of spatial association for each individual location. A local Moran’s I autocorrelation statistic at the location i can be expressed as

$$ {I}_{\mathrm{i}}={z}_{\mathrm{i}}\sum \limits_{\mathrm{j}}{w}_{\mathrm{i}\mathrm{j}}{z}_{\mathrm{j}} $$
(2)

where zi and zj are the standardized scores of attribute values for unit i and j, and j is among the identified neighbors of i according to the weights matrix wij.
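
Eq. 2 can be sketched in the same way. The example below assumes row-standardized weights, a common choice that the text does not state explicitly. It omits the permutation-based significance test normally used to decide which local values form clusters. The function name and data are hypothetical.

```python
import numpy as np

def local_morans_i(x, w):
    """Local Moran's I_i = z_i * sum_j w_ij z_j with row-standardized weights."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = (x - x.mean()) / x.std()              # standardized scores
    row_sums = w.sum(axis=1, keepdims=True)
    w_std = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
    return z * (w_std @ z)

# Hypothetical example: four wards on a line, neighbors share a border.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
local_i = local_morans_i([1.0, 1.0, 5.0, 5.0], w)
```

Positive local values mark wards whose neighbors have similar rates (high-high or low-low), and negative values mark spatial outliers.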

Modeling the spatial relationship

The spatial relationship between ward-level disease incidence and the potential environmental and socioeconomic risk factors was assessed using the OLS, GWR, and s-GWR models. Before running these models, a Pearson correlation test was conducted and highly correlated risk factors (|r| > 0.7), if any, were excluded. The models were compared based on R2, adjusted R2, and the corrected Akaike information criterion (AICc) to identify the best-fit regression model. The R2 value indicates a model’s ability to explain the variance in the dependent variable, and thus a higher R2 implies better model performance. The AICc is an indicator of model accuracy, and a smaller AICc value indicates an improvement in model performance (Ribeiro et al. 2015; Tu and Xia 2008). However, as a rule of thumb, the difference in AICc should be 2 or higher to indicate a substantive difference in goodness of fit (Nakaya et al. 2005). The best model was the one with the minimum AICc, differing by more than 2 from the alternatives, and the maximum R2 value. We used the freely available GWR 4.0 software (Nakaya et al. 2005; Nakaya 2016) to model the spatial relationship of dengue incidence with the potential environmental and socioeconomic risk factors. The residuals of the final model were diagnosed using the Moran’s I test; significant spatial autocorrelation in the residuals indicates misspecification or the omission of key variables needed to explain the spatial patterns. Additionally, observed and predicted dengue cases were mapped and compared using correlation and a scatter plot to assess the predictive performance of the final model.
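
The correlation screening step can be sketched as follows. The function name, the synthetic predictors, and the random seed are hypothetical. Only the |r| > 0.7 threshold comes from the text.

```python
import numpy as np

def highly_correlated_pairs(data, names, threshold=0.7):
    """Return (name_a, name_b, r) for predictor pairs with |r| above threshold."""
    r = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) > threshold:
                pairs.append((names[i], names[j], float(r[i, j])))
    return pairs

# Hypothetical predictors: "b" is almost a copy of "a"; "c" is unrelated.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = a + 0.1 * rng.normal(size=200)
c = rng.normal(size=200)
flagged = highly_correlated_pairs(np.column_stack([a, b, c]), ["a", "b", "c"])
```

One variable from each flagged pair would then be dropped before model fitting; which one to drop remains a judgment call.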

The OLS regression model was applied first to assess the global relationship between dengue incidences and the selected risk factors. The method of least square is expressed in Eq. 3.

$$ {y}_{\mathrm{i}}=a+\sum \limits_{j=1}^k{a}_{\mathrm{j}}{x}_{\mathrm{i}\mathrm{j}}+{\varepsilon}_{\mathrm{i}} $$
(3)

where yi is the ith observation of the dependent variable, xij is the ith observation of the jth independent variable, a is the intercept, aj is the regression coefficient of the jth independent variable, k is the number of independent variables, and εi is the error term. Such a global model assumes that the rate in neighborhood i is independent of the rate in neighboring j and that the residuals are normally distributed with zero mean.
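
A minimal least-squares fit of Eq. 3 is sketched below using NumPy rather than GWR 4.0. The function name and the synthetic data are hypothetical and chosen so that the true coefficients are known.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares estimates [a, a_1, ..., a_k] and R^2 for Eq. 3."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])   # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    centered = y - y.mean()
    r2 = 1.0 - (resid @ resid) / (centered @ centered)
    return coef, r2

# Synthetic, noise-free data with known coefficients (not study data).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = 2.0 + 1.5 * X[:, 0] - 0.5 * X[:, 1]
coef, r2 = fit_ols(X, y)
```

A single coefficient vector is returned for the whole study area, which is exactly the “average” behavior that the local models relax.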

Secondly, we applied GWR to analyze the relationship between dengue incidence and the associated risk factors, which may vary from one ward to another. The geographically weighted regression model is a simple extension of the traditional regression model (Eq. 3) and can be expressed mathematically as:

$$ {y}_i=\sum \limits_{j=1}^k{\beta}_j\left({u}_i,{v}_i\right){x}_{ij}+{\varepsilon}_i $$
(4)

where (ui, vi) are the coordinates of location i, in our case the centroid of each ward, and βj(ui, vi) is the local coefficient of the jth independent variable at that location. Compared with the global regression model, Eq. 4 allows the coefficients to depend on (ui, vi). The local coefficients are estimated by weighting the observations around each location with a distance-decay function, generally called a kernel, following Tobler’s First Law of Geography (Tobler 1970). In this analysis, we used an adaptive bi-square kernel for geographical weighting since it is suitable for clarifying local extents for model fitting and keeps constant the number of areas included in the kernel (Nakaya 2016). The golden section search method was used to automatically and efficiently determine the optimal bandwidth size for geographical weighting. The optimal bandwidth and the associated weighting function were obtained by choosing the lowest AICc score.
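
The local estimation can be illustrated with a short sketch, assuming the usual weighted least-squares form of GWR with an adaptive bi-square kernel. The number of neighbors is fixed by hand here instead of being selected by golden section search on AICc, a constant column in X plays the role of the local intercept, and all names and data are hypothetical.

```python
import numpy as np

def bisquare_adaptive_weights(coords, i, n_neighbors):
    """Bi-square weights around location i. The bandwidth is the distance to the
    n_neighbors-th nearest other location, so every local fit uses the same
    number of wards with non-zero weight (n_neighbors, counting i itself)."""
    d = np.linalg.norm(coords - coords[i], axis=1)
    bandwidth = np.sort(d)[n_neighbors]          # index 0 is location i itself
    return np.where(d < bandwidth, (1.0 - (d / bandwidth) ** 2) ** 2, 0.0)

def fit_gwr(coords, X, y, n_neighbors):
    """Local coefficients beta(u_i, v_i) by weighted least squares at each location."""
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    betas = np.empty((len(y), X.shape[1]))
    for i in range(len(y)):
        w = bisquare_adaptive_weights(coords, i, n_neighbors)
        xtw = X.T * w                             # X^T W_i without forming W_i
        betas[i] = np.linalg.solve(xtw @ X, xtw @ y)
    return betas

# Synthetic example: the slope increases from west to east (not study data).
rng = np.random.default_rng(2)
coords = rng.uniform(0, 10, size=(60, 2))
X = np.column_stack([np.ones(60), rng.normal(size=60)])
true_slope = 1.0 + 0.2 * coords[:, 0]
y = 3.0 + true_slope * X[:, 1]
betas = fit_gwr(coords, X, y, n_neighbors=20)    # one row of coefficients per location
```

When the true coefficients are constant over space, every local fit returns the global coefficients, which is why GWR can be used to check for non-stationarity rather than assuming it.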

Finally, we applied the s-GWR model, treating some predictors as local and others as global. The s-GWR model can be expressed as:

$$ {y}_i=\sum \limits_{j=1}^k{a}_j{x}_{ij}+\sum \limits_{l=k+1}^p{\beta}_l\left({u}_i,{v}_i\right){x}_{il}+{\varepsilon}_i $$
(5)

Equation 5 combines the two previous equations (Eqs. 3 and 4): aj denotes the global parameter estimates of the fixed independent variables, and βl(ui, vi) denotes the local parameter estimates at each location i in space. In the mixed modeling approach (s-GWR), it is necessary to assess which of the selected independent variables exhibit local patterns and which exhibit global patterns.
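
One commonly described way to estimate a mixed model of this form is to remove the locally fitted part from both the response and the global predictors, estimate the global coefficients by least squares, and then fit the local terms to what remains. The sketch below follows that idea with an adaptive bi-square kernel. Whether GWR 4.0 performs exactly this computation is not stated in the text, so the code should be read as an assumption-laden illustration with hypothetical names and synthetic data.

```python
import numpy as np

def local_projectors(coords, X_local, n_neighbors):
    """C[i] = (X^T W_i X)^{-1} X^T W_i, so that beta(u_i, v_i) = C[i] @ y."""
    n, p = X_local.shape
    C = np.empty((n, p, n))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        bw = np.sort(d)[n_neighbors]              # adaptive bandwidth
        w = np.where(d < bw, (1.0 - (d / bw) ** 2) ** 2, 0.0)
        xtw = X_local.T * w
        C[i] = np.linalg.solve(xtw @ X_local, xtw)
    return C

def fit_sgwr(coords, X_global, X_local, y, n_neighbors):
    """Return global coefficients a and local coefficients beta for Eq. 5."""
    C = local_projectors(coords, X_local, n_neighbors)
    S = np.einsum("ip,ipn->in", X_local, C)       # hat matrix of the local part
    M = np.eye(len(y)) - S
    a, *_ = np.linalg.lstsq(M @ X_global, M @ y, rcond=None)   # global terms
    betas = C @ (y - X_global @ a)                # local terms, shape (n, p)
    return a, betas

# Synthetic, noise-free data with known coefficients (not study data).
rng = np.random.default_rng(3)
n = 60
coords = rng.uniform(0, 10, size=(n, 2))
X_global = rng.normal(size=(n, 2))                            # two global predictors
X_local = np.column_stack([np.ones(n), rng.normal(size=n)])   # local intercept + one predictor
y = X_global @ np.array([0.5, -1.0]) + X_local @ np.array([2.0, 3.0])
a, betas = fit_sgwr(coords, X_global, X_local, y, n_neighbors=20)
```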

We used both the geographical variability test and the global-to-local variable selection approach implemented in GWR 4.0 to identify the global and local terms. These methods follow a rationale similar to that of stepwise regression model selection, where the model with the lower AIC or AICc value is usually selected. In the geographical variability test, GWR 4.0 compares a model comparison criterion such as AICc between the original and a switched GWR model. For this, the software first fits a GWR model with all selected variables as spatially varying terms and computes the AICc. In the next step, another model is fitted in which one variable is switched to a fixed term while all others are kept as varying coefficients. If the switched GWR model attains a better fit, that is, its AICc is smaller than that of the original GWR model, this suggests no spatial variability in the selected term. In this condition, the “Diff of Criterion” column, which shows the difference in the model comparison indicator between the original GWR model and the switched GWR model, takes a positive value. In the reverse condition, the selected variable is considered spatially non-stationary and “Diff of Criterion” becomes negative. The test routine repeats this comparison for each geographically varying coefficient. For computational simplicity, the compared models are fitted with the same bandwidth as the original model.
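
The decision rule of the geographical variability test can be written compactly. The sketch assumes the sign convention described above (original AICc minus switched AICc). The function name and the AICc values are hypothetical.

```python
def classify_terms(aicc_original, aicc_switched):
    """Label each variable from Diff of Criterion = AICc(original) - AICc(switched).

    A positive difference means the model with that variable fixed fits better,
    so the term is treated as global; a negative one keeps it local.
    """
    labels = {}
    for name, aicc in aicc_switched.items():
        diff = aicc_original - aicc
        labels[name] = ("global" if diff > 0 else "local", diff)
    return labels

# Hypothetical AICc values, not results from the study.
labels = classify_terms(1000.0, {"x1": 995.0, "x2": 1012.0})
```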

Like the geographical variability test, the global-to-local variable selection approach first fits a global model with all selected variables and then switches them one by one to a local term, with the other terms remaining unchanged, and compares the model fits. If the switched model fits better, the selected variable is considered a local term; otherwise, it is kept as a global term. The test routine repeats this comparison for each selected variable. Unlike the geographical variability test, which uses a fixed bandwidth, bandwidth selection is applied to each compared model.

The local parameters, such as local R2, coefficients of local terms, and predicted cases with associated residuals, were mapped using ArcGIS 10.3. Mapping local parameters facilitates interpretation based on spatial context and known characteristics of the study area (Goodchild and Janelle 2004). However, mapping parameter estimates alone is misleading, as the map reader has no way of knowing whether the local parameter estimates are significant anywhere on the map (Matthews and Yang 2012). Therefore, areas where the pseudo t value did not exceed ± 1.96 were masked as statistically insignificant (Matthews and Yang 2012; Ehlkes et al. 2014; Wabiri et al. 2016).
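
The masking rule can be sketched as follows, assuming that the pseudo t value is the local coefficient divided by its local standard error. The function name and the numbers are hypothetical.

```python
import numpy as np

def mask_insignificant(coefs, std_errors, critical=1.96):
    """Return local coefficients with NaN where |pseudo t| <= critical."""
    coefs = np.asarray(coefs, dtype=float)
    pseudo_t = coefs / np.asarray(std_errors, dtype=float)
    return np.where(np.abs(pseudo_t) > critical, coefs, np.nan)

# Hypothetical local coefficients and standard errors for three wards.
masked = mask_insignificant([4.0, -0.5, -6.0], [1.0, 1.0, 2.0])
```

Wards set to NaN would be drawn in a neutral color so that only significant local coefficients carry a color ramp.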

Results

Spatial distribution and local and global clustering patterns

A total of 605 cases of dengue fever were reported from Jhapa district between January 2011 and October 2016, of which 568 cases were geocoded at ward level. Dengue cases reported from outside the district and cases without ward-level information were excluded. At least one case of dengue fever was recorded in 56 of the 470 wards of the district. The distribution of the 6-year averaged dengue fever incidence rate is presented in Fig. 2b, which shows substantial variation in dengue incidence rates across the district. The highest rates (more than 500 cases per 100,000 persons) were observed in the core towns of Mechi municipality (ward 10) and Damak municipality (wards 10, 13, 14), located at the eastern and western margins of the district, respectively, while lower rates were observed in the areas surrounding the core towns. The global autocorrelation assessment, with a low p value and a high z score, suggests a globally clustered pattern (Moran’s I = 0.48, z = 25.054, p < 0.001). The Local Moran’s I approach identified 20 wards from Damak, Lakhanpur, and Mechinagar as significant local clusters. The locations of the high-rate clusters are shown in Fig. 2a, b with purple outlines.

Fig. 2 (a) Aggregated counts of 2011–2016 reported dengue fever cases and (b) associated raw rates per 100,000 in Jhapa district. Clusters of high rates are identified using local Moran’s I and are outlined in purple color

Identification of spatially varying environmental and socioeconomic risk factors

Based on Pearson’s correlation test (Table 2), we removed the highly correlated (|r| > 0.7) explanatory risk factors (NDBI and NDWI). After removing these two variables, five potential environmental and socioeconomic risk factors remained for further analysis, namely NDVI, LST, population density, proximity to road, and proportion of urban area. The spatial distributions of these variables, summarized at ward level, are mapped in Figs. 3 and 4.

Table 2 Correlation matrix of dengue incidence and potential risk factors based on Pearson’s correlation test (p < 0.5)

Fig. 3 Spatial distribution of spatially stationary variables: (a) NDVI and (b) LST in Jhapa district

Fig. 4 Spatial distribution of spatially non-stationary explanatory variables: (a) population density, (b) proximity to road, and (c) urban proportion

The spatial association of these five potential risk factors with dengue incidence was evaluated with the OLS, GWR, and s-GWR models using the GWR 4.0 software (Nakaya et al. 2005; Nakaya 2016).

Table 3 summarizes the results obtained from GWR 4.0 (Nakaya et al. 2005; Nakaya 2016), comparing OLS, GWR, and s-GWR in terms of AICc, R2, and adjusted R2. The global regression model (OLS) explained only 22% of the variance of ward-level dengue incidence in the district, which increased to 69% when the model was calibrated as a geographically weighted regression model considering local effects of all five predictor variables (Table 3). The model fit improved substantially, with the AICc reduced from 5569 to 5187. In the geographical variability test and the global-to-local variable selection approach, LST and NDVI showed positive “Diff of Criterion” values (Table 4) (Nakaya 2016), suggesting no spatial variability in them. These two variables were therefore retained as global terms, while the other three explanatory variables were treated as local terms. Accordingly, the final model was calibrated with NDVI and LST as global terms and the other three variables as local terms. This regression model explained about 79% of the variance and further reduced the AICc from 5187 to 5054.27, thereby improving the model fit. Therefore, the s-GWR model with NDVI and LST as global terms and population density, proportion of urban area, and proximity to road as local terms was selected as the final model.
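
The model comparison can be checked with simple arithmetic on the AICc values reported above, applying the rule of thumb that a difference of more than 2 is substantive. The variable names are illustrative only.

```python
aicc = {"OLS": 5569.0, "GWR": 5187.0, "s-GWR": 5054.27}  # values reported in the text

best = min(aicc, key=aicc.get)
# Difference between each model and the best one; > 2 counts as substantive.
delta = {name: round(value - aicc[best], 2) for name, value in aicc.items()}
substantive = all(d > 2 for name, d in delta.items() if name != best)
```

Here the s-GWR model has the lowest AICc, and its distance from both alternatives (about 133 from GWR and about 515 from OLS) is far larger than the threshold of 2.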

Table 3 Comparison of model performances based on three fitness parameters: AICc, R2, and adjusted R2
Table 4 Summary of s-GWR models for different potential environmental and socioeconomic risk factors

The Moran’s I of the residuals of the final s-GWR model was close to zero (− 0.056960; z score = − 2.627548, p value = 0.0086), indicating that little spatial autocorrelation remained in the residuals and that the model was reasonably well specified. The validity of the final model was further assessed by comparing the observed and predicted dengue cases statistically (Fig. 5) and spatially (Figs. 2a and 8a). The Pearson’s correlation coefficient (r = 0.88, p < 0.05) showed a strong association between observed and predicted dengue cases.

Fig. 5 Statistical association between observed and predicted dengue cases in Jhapa district, Nepal

Figure 6 shows the local coefficients of determination, R2(i), of the final s-GWR model, which were not homogeneously distributed across the wards of the district. Overall, the model showed strong predictive power in Mechi Nagar and Damak municipalities, coinciding with the high dengue clusters (Fig. 2a). Moderate strength, with local R2 values between 0.3 and 0.5, was observed in the central part of the district, and low strength elsewhere. The lower R2(i) values in these other parts of the district indicate a poor regression fit.

Fig. 6 Spatial distribution of local R2 of the s-GWR model of 6-year averaged dengue incidence rates in Jhapa district, Nepal

Table 4 shows the associations of the finally selected explanatory variables in the s-GWR model. LST was positively associated globally, but its influence in the model was minimal. The association of NDVI was globally negative and stronger than that of LST. The other three variables exhibited strong ward-level variation in their association. The variation of the estimated local coefficients and the associated t statistics is shown in Fig. 7. Areas with coefficients significant at the 95% confidence level, where the pseudo t value exceeded ± 1.96 (Wabiri et al. 2016), are symbolized with bivariate graduated colors, while areas with insignificant t values are masked in grey.

Fig. 7 Geographically weighted regression parameters (a) population density, (b) proximity to road, and (c) proportion of urban area; significant areas at ± 1.96 level of the s-GWR model

The average association of population density was moderately positive (βpop density = 3.93). However, the strength of the association varied greatly across the district. In the east, around Mechi municipality and the surrounding areas, a strong positive relationship was observed, while in the west, around Damak municipality, the association was negative. In the central part of the district, the effect of population density was moderately positive.

The average association of proximity to road was negative. However, the strength of the association varied greatly across the district. There was a strong negative association (βprox road ≤ 150 = − 7.824721) in Damak municipality and the surrounding areas, which was much weaker in the east. A positive association (βproportion of urban area = 78.684616) was observed between the proportion of urban area and dengue incidence; this association was stronger in the east than in the west.

Figure 8a, b shows the spatial distribution of the predicted dengue fever rates and of the residuals, respectively. Higher rates were identified in 30 neighborhoods, 20 of which matched those with high observed rates. Overall, the residuals ranged from − 229.879058 to 759.073333 cases per 100,000 individuals, with a mean of − 0.468506.

Fig. 8 Predicted dengue fever rates and residuals in (a) and (b), respectively. Higher values are determined using the local Moran’s I statistic and highlighted in purple. Use the observed rates in Fig. 2b for comparison

Discussion

Dengue fever has been one of the major public health problems in Nepal in recent years, especially in the low-land Tarai. The higher-elevation hilly and mountainous districts in the north are free from this disease, probably due to the absence of vector mosquitoes owing to low temperatures. Dengue is a newly emerged disease in Nepal, first reported in 2004 A.D. Sporadic cases were reported every year until 2010. Nepal experienced two major outbreaks, in 2010 and 2013, with 917 and 642 laboratory-confirmed cases, respectively. Five individuals died from dengue infection in 2010. Low-land Tarai districts, especially Chitwan and Jhapa, are the most vulnerable to dengue, although the disease has recently expanded to hilly districts too. Jhapa is one of the worst dengue-affected districts in Nepal despite the recent emergence of the disease there (DOHS 2015). The present study mapped the spatial distribution and explored the global and local clustering patterns of the 6-year (2011–2016) averaged dengue fever incidence rate using ward-level areal data. In addition, the spatial association of dengue fever with various potential environmental and socioeconomic risk factors was assessed by comparing OLS, GWR, and s-GWR regression models.

The results of this study showed that dengue fever in Jhapa district during the study period was heterogeneously distributed and spatially clustered at ward level, the lowest administrative unit in Nepal, indicating small-scale focality of the disease. The results are consistent with findings from previous studies conducted in different dengue-endemic regions of the world (Delmelle et al. 2016; Wijayanti et al. 2016; Arboleda et al. 2009; Lin and Wen 2011). To the best of our knowledge, this is the first local-level study in Nepal to explain the spatial association between dengue and associated risk factors, although coarser-level spatial distribution and cluster identification work was carried out earlier (Acharya et al. 2016). The findings could be useful for local-level policy formulation and the implementation of dengue control.

Our study showed the importance of a mixed geographical modeling approach for local-level risk factor analysis by comparing global (OLS), local (GWR), and mixed (s-GWR) models. Our analysis showed the limitations of the OLS method in explaining the spatial variation of dengue incidence, in terms of predictive performance, model accuracy, and complexity, compared to the GWR model. We showed that both predictive performance and model accuracy can be further improved through the implementation of the s-GWR model. These findings are consistent with studies of schistosomiasis in South Africa (Manyangadze et al. 2016), malaria in Ghana (Ehlkes et al. 2014), and urban expansion in India (Mondal et al. 2015). However, when predictor variables do not exhibit spatial non-stationarity, a global regression model (i.e., OLS) is generally recommended instead of GWR or s-GWR to avoid model complexity (Ramezankhani et al. 2017).

Our final s-GWR model explained the highest deviance (R2 = 0.76) among the three regression models. The deviance not explained by our model could be due to nonlinear effects of the selected variables, other potential risk factors not included in the model, and the immune status of the host population (Gubler and Clark 1995). Moran’s I showed little remaining autocorrelation in the residuals, supporting the view that the variables considered in this study were able to predict the spatial distribution of dengue fever. The high concordance of the model inferences with observations (Wijayanti et al. 2016) showed the high predictive performance of our final s-GWR model. Although the predicted rates show a broader spatial pattern of high values than the observed rates (Figs. 2b and 8a), this is likely due to the GWR smoothing effect (Delmelle et al. 2016).

A major benefit of local spatial statistics, including GWR, is their ability to visually represent the varying strength of the relationship between the dependent and independent variables (Buck 2016) and to facilitate interpretation based on spatial context and known characteristics of the study area (Goodchild and Janelle 2004). The variation in local R2 across the wards revealed strong regional differences in dengue transmission processes in the study area. The local R2 showed that the local model performed better in hotspot areas than in other parts of the study area, matching similar previous studies from Colombia (Delmelle et al. 2016) and South Africa (Manyangadze et al. 2016).

Consistent with previous studies (Lin and Wen 2011; Ren et al. 2017; Delmelle et al. 2016; Qi et al. 2015), three socioeconomic factors, namely proportion of urban area, proximity to road, and population density, were the most important risk factors for spatial variations of dengue incidence in Jhapa district during the study period. High population density (Wijayanti et al. 2016; Araujo et al. 2014) and the availability of artificial breeding sites (e.g., water-storage containers, aquariums, traditional bathtubs) are generally held responsible for elevated dengue risk in urban areas (Wijayanti et al. 2016; Wu et al. 2009; Akhtar et al. 2016). Higher population density may lead to higher vector-host contact rates and a higher incidence rate. Similarly, higher dengue risk has been reported in areas close to roads than in places distant from roads (Mahabir et al. 2012). Road transportation plays a significant role in the long-distance spread of dengue virus given the limited flight range of dengue vectors (Qi et al. 2015). However, our findings also revealed spatial heterogeneity in the relationships between these risk factors and dengue incidence patterns: the intensity and direction of the associations varied greatly from one ward to another, sometimes in opposite directions. In the east, dengue incidence was highly concentrated in the core urban area (Mechi Nagar-10), which had high population density and impervious surface during the study period. Therefore, the local effects of population density and proportion of urban area are significantly high in this area, whereas the effect of proximity to road is low. However, the reverse association was observed in the west, in Damak and surrounding areas, where a high incidence rate was observed in a slightly dispersed pattern extending from the core urban area to some neighboring wards with low population density and comparatively less urban impervious surface than the east. Therefore, proximity to road is one of the most important risk factors for dengue transmission in the west, with moderate local effects of urban proportion and a negative association with population density. Spatially non-stationary relationships with population density, road, and urban area were also observed in previous studies (Delmelle et al. 2016; Ren et al. 2017; Lin and Wen 2011) in other dengue-affected regions.

Among the five finally selected potential environmental and socioeconomic risk factors, NDVI and LST exhibited global effects on the distribution of dengue incidence, possibly because the study area is small, with little variation in vegetation and temperature dynamics (Homan et al. 2016; Qi et al. 2015). The negative association of dengue fever with NDVI is consistent with several previous local-level studies (Troyo et al. 2009; Araujo et al. 2014) but inconsistent with some others (Martínez-Bello et al. 2017). The discrepancies might be due to differences in resolution, spatial aggregation unit, and a non-linear relationship between dengue and NDVI (Qi et al. 2015). The association of LST with dengue was positive but weak compared to that of NDVI. A relationship between temperature and dengue was also reported by Wijayanti et al. (2016), who found an increasing risk of dengue infection with decreasing minimum night-time temperature.

The findings of this study have direct implications for health policy and decision making. First, dengue being a highly focal disease, health authorities should consider selecting micro-geographical areas (in this case, the ward) rather than macro-level units (the district) for control and intervention programs. The method adopted here could be a valuable tool for finding such high-risk areas. Second, we suggest that government efforts in control and intervention programs should be concentrated in densely populated areas, urban centers, and areas along major roads, especially rapidly urbanizing areas. However, the authorities should take into account the geographical heterogeneity of potential risk factors and be aware that the same dengue control and intervention strategies may not be suitable everywhere and at all times.

Our study has some limitations that need to be addressed in future studies. Possible under-reporting of dengue cases due to poor surveillance and data management systems may introduce bias into our study. Similarly, due to data unavailability, we could not include some important local-scale risk factors in our analysis, such as intra-urban mobility and migration patterns, the quality of the health care system, the treatment-seeking behavior of different social groups, and the extent and coverage of dengue control programs. Likewise, the GWR model is sensitive to kernel type and bandwidth selection method, and the results depend on how these parameters are implemented in the analysis. The nonlinear effects of the predictors could not be included in our analysis. Despite these limitations, this is the first spatially explicit dengue research in Nepal to map and explore potential environmental and socioeconomic risk factors in one of the most dengue-affected districts of Nepal at the lowest administrative unit. The methodological framework developed in this study is transferable to other regions, to different spatial scales depending upon data availability, and to other mosquito-borne diseases. Finally, this study demonstrates the importance of a mixed geographical regression modeling approach in the spatial analysis of disease and other phenomena affected by complex environmental and socioeconomic factors at the local scale.
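The sensitivity of GWR to kernel type and bandwidth noted above can be explored with a sketch such as the following. It is an illustration under stated assumptions rather than the procedure used in this study: the data are synthetic, the candidate bandwidths are arbitrary, leave-one-out cross-validation is only one of several possible selection criteria, and all function names are hypothetical.

```python
import numpy as np

def gaussian(d, h):
    """Gaussian kernel: weights decay smoothly and never reach zero."""
    return np.exp(-0.5 * (d / h) ** 2)

def bisquare(d, h):
    """Bisquare kernel: weights drop to exactly zero beyond distance h."""
    return np.where(d < h, (1.0 - (d / h) ** 2) ** 2, 0.0)

def loo_cv_score(coords, X, y, h, kernel):
    """Mean squared error when each point is predicted from a local fit that excludes it."""
    sq_err = 0.0
    for i in range(len(y)):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = kernel(d, h)
        w[i] = 0.0  # leave the target observation out
        XtW = X.T * w
        # lstsq copes with the near-singular systems that a narrow kernel can produce
        beta_i, *_ = np.linalg.lstsq(XtW @ X, XtW @ y, rcond=None)
        sq_err += (y[i] - X[i] @ beta_i) ** 2
    return sq_err / len(y)

# Synthetic data (illustrative only): the slope of x varies from west to east.
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))
x = rng.normal(size=n)
y = 1.0 + (-1.0 + 0.2 * coords[:, 0]) * x + rng.normal(scale=0.3, size=n)
X = np.column_stack([np.ones(n), x])

bandwidths = [1.0, 1.5, 2.0, 3.0, 5.0, 8.0]
cv_gaussian = {h: loo_cv_score(coords, X, y, h, gaussian) for h in bandwidths}
cv_bisquare = {h: loo_cv_score(coords, X, y, h, bisquare) for h in bandwidths}

best_gaussian = min(cv_gaussian, key=cv_gaussian.get)
best_bisquare = min(cv_bisquare, key=cv_bisquare.get)
print("best Gaussian bandwidth:", best_gaussian)
print("best bisquare bandwidth:", best_bisquare)
```

Because the two kernels weight neighbours differently, the bandwidth that minimises the cross-validation score need not be the same for both, which is one reason why results should be reported together with the kernel and selection criterion used.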

Conclusion

This study explored and analyzed the spatial distribution of dengue fever incidence and its relationship with various potential environmental and socioeconomic risk factors in Jhapa district of Nepal. It revealed that the distribution of dengue fever in Jhapa district was heterogeneous and highly clustered at the ward level. Proportion of urban area, proximity to road, and population density were the most important risk factors responsible for the spatial variation in disease incidence. This study also demonstrated the importance of a mixed geographical modeling approach (e.g., s-GWR) for improving the accuracy of predictive models. This evidence can be used for control and management of the disease at the micro scale. Future research should consider including more risk factors that may further improve the performance of s-GWR models in determining the local variation of dengue infection intensity.