Introduction

Children in sub-Saharan Africa bear a disproportionately large burden of morbidity and mortality. About 180 deaths per 1,000 live births occur in the region [1], mostly from a relatively small number of infectious diseases [2]. Among these, diarrhoea, pneumonia and malaria are the most common causes and inflict the largest burden. For example, a recent national sample survey carried out in Malawi reported that the prevalence of fever, diarrhoea and pneumonia was 44, 22 and 45% respectively [3]. At outpatient clinics in the country, malaria is the most common cause of attendance, followed by acute respiratory infections such as pneumonia [4]. Although the trend is downward, these diseases, together with diarrhoea, are still a major cause of mortality in most sub-Saharan African countries. In Malawi in particular, malaria, diarrhoea and pneumonia cause 14, 18 and 23% of overall under-five mortality [4]. More often than not, these illnesses occur simultaneously, largely because they share common risk factors, because multiple risk factors overlap, or because one disorder increases the risk of another. Co-morbidity of any of these illnesses exacerbates disease severity and leads to early and high childhood mortality [5, 6].

The last few years have seen increased attention to reducing the burden of disease in poorly resourced countries such as those in sub-Saharan Africa. With improved sanitation and increased access to safe drinking water, coupled with other interventions such as insecticide-treated bednets, vitamin A and other micronutrient supplements, as well as vaccinations, the burden of these diseases can be reduced dramatically [7]. Moreover, successful control of certain diseases such as malaria and HIV/AIDS can lower the risk of other diseases such as pneumonia and diarrhoea; thus there is potential public health importance in, and synergistic opportunities for, combined control [2, 6]. Remarkably, little research has modelled multiple disease outcomes, and little is known about geographical overlaps in these illnesses, although such knowledge may improve our understanding of the epidemiology of the diseases and support efficient and cost-effective control. Simultaneous modelling of diseases is appealing because it can identify common or overlapping risk factors, which is critical for integrated management of these diseases [8, 9]. Evidence-based utilization of resources may require knowledge of local and geographical variability of risk [10]. Spatial modelling has emerged as a valuable approach for explaining small-area variations in such outcomes [8, 11].

This study is motivated by the analysis of childhood morbidity in Malawi, using the 2000 Malawi Demographic and Health Survey [12]. The survey recorded self-reported morbidity from common illnesses such as fever, diarrhoea and pneumonia among under-five children in the 2 weeks preceding the survey. Previous studies have used binary response models to analyse the probability of each individual illness [13, 14]. However, the diseases often co-exist in the same eco-epidemiological settings and may share common risk factors [5]. Thus much work remains to be done to develop a better understanding of childhood co-morbidities. To study patterns of co-morbidity, combinations of overlapping illnesses reported in a child can be constructed to give a multi-categorical response, which can be analysed by multinomial regression models. An alternative approach for modelling several diseases is to extend the binary models to the multivariate setup, which is a more natural way to analyse more than one disease simultaneously. Two approaches to the joint analysis of diseases have emerged: the shared component model [8, 9] and the multivariate random fields model [15, 16].

The modelling framework proposed in this paper is the multinomial approach. The multinomial approach is preferred because it allows separate treatment of the co-morbidities versus a control or baseline group. The multinomial models have received a lot of attention [17], and have recently been extended to incorporate spatial random effects to deal with unstructured extra-multinomial heterogeneity and spatially structured variations within the framework of generalized linear mixed models [18, 19]. A model framework for analysing space-time multi-categorical responses has also been reported [19, 20].

Spatial random effects make it possible to quantify the effects of unobserved environmental factors that are represented by geographical location. Some of these factors operate at a small scale, while others operate at a large scale, so that risk patterns are more similar in neighbouring areas than in areas further apart. When these effects are estimated and mapped, they may be compared with known spatial patterns of possible explanatory factors, or they may provide leads for further epidemiological investigation. Incorporating a spatially correlated prior also permits smoothing, which increases the precision of the estimated effects and is necessary when sparse counts are observed in small areas [18]. Furthermore, many spatial models analyse only a single spatial random effect, that is, the data are associated with one spatial unit only. In many situations, however, data are nested within a hierarchy of administrative areas; for example, data can be clustered in three administrative structures such as region, district and sub-district, each of which has some influence on the health outcome. One may therefore be interested in measuring the spatial effect of each level on the outcome; moreover, failure to account for the within-area correlation may lead to biased estimates. Recent literature has proposed models to analyse such data with multilevel or multi-scale geographical structures [21, 22, 23]. Spatial random effects for each area are introduced, in the same fashion as for a single spatial structure, to estimate variation of risk at different levels.

In this study, we employed a Bayesian hierarchical approach, extending the approaches of Vounatsou et al. [18], Dean and MacNab [21], Banerjee et al. [22] and Muggin et al. [23], to analyse patterns of childhood co-morbidity of fever, diarrhoea and pneumonia in Malawi, with data clustered within two geographical levels, that is, subdistricts and districts. Two reasons motivated the choice of these three diseases for spatial analysis. First, as discussed above, they are of epidemiological importance in Malawi because they are the main causes of morbidity and mortality in under-five children. Second, the diseases share common risk factors; in particular, they are associated with environmental factors, among which geographical location is an important risk factor, and are thus plausible candidates for spatial analysis. Spatial structure was modelled using intrinsic conditional autoregressive (CAR) models [24]. To further our understanding of potential variation in risk, various models were fitted that introduced spatial structure at one or both levels, and competing models were compared using the deviance information criterion [25].

Methods

Data

We analysed data from 4,778 children under five years of age. The data were collected as part of the 2000 Malawi Demographic and Health Survey (DHS) [12], which used a two-stage stratified sampling design. At the first stage, a total of 560 enumeration areas (EAs), as defined in the 1998 Malawi Population and Housing Census, were selected, stratified by urban-rural status, with sampling probability proportional to the population of the EA. At the second stage, a fixed number of households were randomly selected in each EA. All women aged 15–49 years were eligible for interview.

The outcome variables were derived from the self-reported sickness status of each child for the three illnesses (fever, diarrhoea and pneumonia) experienced within 2 weeks prior to the survey date, as reported by the care-givers (often mothers). A multi-categorical response was constructed as follows: (1) if the child experienced all three illnesses (ALL), (2) if the child was sick with both fever and diarrhoea (FD), (3) if the child had both fever and pneumonia (FP), (4) if the child had both diarrhoea and pneumonia (DP), (5) if the child experienced fever only, (6) if the child experienced diarrhoea only, (7) if the child experienced pneumonia only, and (8) if the child experienced no disease within the observation period. Since detailed single-disease analyses have been reported elsewhere [14], our reporting focuses on the first four disease combinations. For completeness, results for the single diseases are also reported.
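The construction of the eight-category response from the three binary illness indicators can be sketched as follows. This is a minimal illustration only: the function name `comorbidity_category` and the label dictionary are hypothetical and are not taken from the survey files or from the software used in the analysis; the category codes follow the numbering given in the text.

```python
# Category codes follow the numbering in the text: 1 = ALL, ..., 8 = none.
CATEGORY_LABELS = {
    1: "ALL", 2: "FD", 3: "FP", 4: "DP",
    5: "fever only", 6: "diarrhoea only", 7: "pneumonia only", 8: "none",
}


def comorbidity_category(fever, diarrhoea, pneumonia):
    """Map three yes/no illness reports to the response category 1..8."""
    lookup = {
        (True, True, True): 1,     # all three illnesses
        (True, True, False): 2,    # fever and diarrhoea
        (True, False, True): 3,    # fever and pneumonia
        (False, True, True): 4,    # diarrhoea and pneumonia
        (True, False, False): 5,   # fever only
        (False, True, False): 6,   # diarrhoea only
        (False, False, True): 7,   # pneumonia only
        (False, False, False): 8,  # no illness reported
    }
    return lookup[(bool(fever), bool(diarrhoea), bool(pneumonia))]
```

For example, a child reported to have had fever and pneumonia but no diarrhoea falls in category 3 (FP).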

The following individual covariates were included in the analysis: (1) age of the child categorized as (a) 1–5 months, (b) 6–11 months, (c) 12–23 months, (d) 24–35 months and (e) 36–59 months (reference group); (2) ownership of bednets (yes = 1, no = 0); (3) receipt of vitamin A within 6 months prior to the survey date (yes = 1, no = 0); (4) weight-for-age (WTAGE), a general indicator of nutritional status measured as Z-scores and fitted as a continuous variable; (5) type of place of residence (rural = 1, urban = 0); (6) a crowding indicator based on whether household size exceeded 6 (yes = 1, no = 0). The category coded 0 was the reference group for all binary variables above.

Table 1 gives a summary of the data. Individual data were nested within two areas: 364 subdistricts and 31 districts. The majority (58.5%) of children suffered one or none of the illnesses. Co-morbidity of fever and pneumonia was highest (22.2%), followed by multiple morbidity of fever, diarrhoea and pneumonia (11.1%). The corresponding proportions for fever, diarrhoea and pneumonia were 14.7, 2.7, 19.6% respectively, while 21.4% did not have any disease at the time of the survey. Very young infants (age 0–5 months) and older children (36–59 months) were proportionally less sick compared to the other age groups, across all disease combinations. Rural children were disproportionately more sick than their urban counterparts.

Table 1 Summary of data of children reported to have had all illnesses (fever, diarrhoea or pneumonia), fever and diarrhoea (FD), fever and pneumonia (FP), diarrhoea and pneumonia (DP); single disease (fever, diarrhoea, pneumonia) or none at the time of survey. Percentages are given unless stated otherwise

Model

Let \(Y_{ijk}\) and \(\pi_{ijk}\) be the sickness status and probability of multiple morbidity of fever, diarrhoea and pneumonia (k = 1), co-morbidity of fever and diarrhoea (k = 2), co-morbidity of fever and pneumonia (k = 3), co-morbidity of diarrhoea and pneumonia (k = 4), fever only (k = 5), diarrhoea only (k = 6), pneumonia only (k = 7), and no disease (k = 8) for child j, \(j = 1,\ldots,n_i\), in area i, \(i = 1,\ldots,I\). We assumed that the vector \(Y_{ij}=(Y_{ij1},\ldots,Y_{ij8})'\) follows a multinomial distribution, i.e., \(Y_{ij}\sim MN(1,\pi_{ij})\), where \(\pi_{ij}=(\pi_{ij1},\pi_{ij2},\ldots,\pi_{ij8})'\). Given some covariates, \(x_{ij}\), and area-specific random effects, \(s_{ik}\), the probability of co-morbidity can be modelled as [18, 19, 20]

$$ \pi_{ijk}=\frac{\exp(\eta_{ijk})}{1+\sum_{l=1}^{7}\exp(\eta_{ijl})}, \quad k=1,2,\cdots,7 $$
(1)

where

$$ \eta_{ijk}=x_{ij}'\beta_k+s_{ik} $$
(2)

is the linear predictor. We adopt a logit link relative to category 8, so that \(\eta_{ijk}=\log(\pi_{ijk}/\pi_{ij8})\). In other words, the last category, k = 8, is the reference, with \(\pi_{ij8}=1-\sum_{k=1}^{7}\pi_{ijk}\). The component \(\beta_k\) is a vector of regression parameters for each sickness status k, and we shall refer to \(\exp(\beta_k)\) as the relative odds ratio (ROR). The random effects, \(s_{ik}\), are district- or sub-district-specific factors and can be split into spatially structured variation (\(\theta_{ik}\)) and unstructured extra-multinomial heterogeneity (\(\phi_{ik}\)), such that \(s_{ik}=\theta_{ik}+\phi_{ik}\).
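As an illustration of the baseline-category logit in Eq. (1), the following sketch converts seven linear predictors into the eight category probabilities, with category 8 as the reference. The function name `category_probabilities` is hypothetical, and the code is a generic numerical sketch rather than the routine used by the fitting software.

```python
import math


def category_probabilities(eta):
    """Return [pi_1, ..., pi_8] from the seven linear predictors eta_1..eta_7.

    Category 8 ("no disease") is the reference, so its predictor is fixed at 0.
    """
    if len(eta) != 7:
        raise ValueError("expected seven linear predictors (categories 1-7)")
    shift = max(0.0, max(eta))            # subtract the largest predictor for numerical stability
    weights = [math.exp(e - shift) for e in eta]
    reference = math.exp(-shift)          # weight of category 8, i.e. exp(0 - shift)
    total = reference + sum(weights)
    return [w / total for w in weights] + [reference / total]
```

The returned probabilities sum to one, and the log of each ratio of a category probability to the reference probability recovers the corresponding linear predictor, which is the property that gives \(\exp(\beta_k)\) its relative odds ratio interpretation.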

Supposing that the outcome is clustered at both the sub-district and district administrative levels, two area-specific random effects can be introduced in Eq. (2) to model their effects. The predictor then becomes

$$ \eta_{hijk}=x_{hij}'\beta_k+s_{hik}+d_{hk} $$
(3)

for sickness status k of child j in subdistrict i and district h. The components \(s_{hik}\) and \(d_{hk}\) are area-specific random effects for the subdistrict and district respectively, each of which can be further split into spatially structured variation and unstructured heterogeneity.
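The two-level predictor in Eq. (3) is simply the fixed-effects contribution plus one random effect per geographical level. A minimal sketch, in which the function name `linear_predictor` and all numerical values are hypothetical and chosen only to show the arithmetic:

```python
def linear_predictor(x, beta_k, s_subdistrict, d_district):
    """Evaluate eta = x'beta_k + s + d for one child and one category k."""
    if len(x) != len(beta_k):
        raise ValueError("covariate and coefficient vectors must have equal length")
    fixed_part = sum(xi * bi for xi, bi in zip(x, beta_k))
    return fixed_part + s_subdistrict + d_district


# Hypothetical values: intercept, rural indicator, weight-for-age Z-score.
x = [1.0, 0.0, -1.2]
beta_k = [0.3, 0.5, -0.25]
eta = linear_predictor(x, beta_k, s_subdistrict=0.10, d_district=-0.05)
```

Setting `d_district` to zero recovers the single-level predictor of Eq. (2).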

To estimate model parameters we applied a fully Bayesian approach, which requires prior distributions for all parameters in the model. For the spatially structured random effects, an intrinsic conditional autoregressive (CAR) prior was chosen [24]. This assumes that the effect for each area, \(\theta_i\), conditional on the effects \(\theta_l\) of its neighbouring areas, has a normal distribution with mean equal to the average of the neighbouring effects and variance inversely proportional to the number of neighbours \(m_i\). Under contiguity, with \(w_{il}=1\) if areas i and l are adjacent and \(w_{il}=0\) otherwise, the CAR prior has the form

$$ \theta_i|\{\theta_l; l\sim i\}\sim N\left(\frac{1}{m_i}\sum_{l\sim i}\theta_l,\frac{\sigma^2_{\theta}}{m_i}\right) $$
(4)

where \(l\sim i\) denotes adjacency of areas l and i on the map, and \(\sigma^2_{\theta}\) is the spatial variance, which controls the degree of smoothness. At a further step of the hierarchy, \(\sigma^2_{\theta}\) is assigned an inverse gamma (IG) prior with known hyperparameters a = b = 0.001. This gives a weakly informative but proper prior. For moderate to large data sets, results are rather insensitive to the choice of a and b. However, because of known concerns that this prior can be informative, a sensitivity analysis was carried out.
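The conditional mean and variance in Eq. (4) can be computed directly from an adjacency list. The sketch below uses a toy map of four areas arranged in a line; the function name `car_conditional`, the neighbour structure and the numerical values are all hypothetical and serve only to illustrate the formula.

```python
def car_conditional(i, theta, neighbours, sigma2_theta):
    """Conditional mean and variance of theta_i given its neighbours (Eq. 4)."""
    adjacent = neighbours[i]
    m_i = len(adjacent)                   # number of neighbours of area i
    if m_i == 0:
        raise ValueError("the intrinsic CAR prior needs at least one neighbour")
    mean = sum(theta[l] for l in adjacent) / m_i
    variance = sigma2_theta / m_i
    return mean, variance


# Toy map: areas 0-1-2-3 in a row, so interior areas have two neighbours.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
theta = [0.2, -0.1, 0.4, 0.0]
mean_1, var_1 = car_conditional(1, theta, neighbours, sigma2_theta=0.5)
```

Here area 1 is shrunk towards the average of areas 0 and 2, and its conditional variance is half the spatial variance because it has two neighbours; areas with more neighbours are smoothed more strongly.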

The unstructured extra-multinomial heterogeneity was estimated using an exchangeable normal prior, \(\phi_i\sim N(0,\sigma^2_{\phi})\), where \(\sigma^2_{\phi}\) measures the degree of heterogeneity and was again assigned an IG hyperprior. The fixed regression coefficients were assigned diffuse priors, \(p(\beta_k)\propto\) constant.

Analysis

A multinomial regression model was fitted to the eight-category response variable (see the Data section) to assess the effect of fixed individual covariates (Table 1). The model was then extended to incorporate extra-multinomial heterogeneity and spatially structured variation. Since the data were clustered at two geographical levels, we considered various spatial structure formulations. The subdistricts were fitted as spatially structured effects, unstructured heterogeneity effects, or both combined as a Gaussian convolution prior. Similar model formulations were repeated using district as the only spatial unit. The two spatial levels were then combined by fitting spatially structured variation at sub-district level and unstructured heterogeneity effects at district level. A model with two CAR random effects, one for the subdistrict and the other for the district, could also be considered; however, this was not estimated because a CAR prior at subdistrict level is adequate to capture within- and between-district variation, while the exchangeable prior at district level absorbs all other unobserved effects not modelled by the subdistrict CAR term. As a result, we analysed the following set of models:

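  • Model 0: \(\eta = x^{T}\beta\)

  • Model 1: \(\eta = x^{T}\beta + \phi_{\text{subdistrict}}\)

  • Model 2: \(\eta = x^{T}\beta + \theta_{\text{subdistrict}} + \phi_{\text{subdistrict}}\)

  • Model 3: \(\eta = x^{T}\beta + \phi_{\text{district}}\)

  • Model 4: \(\eta = x^{T}\beta + \theta_{\text{district}} + \phi_{\text{district}}\)

  • Model 5: \(\eta = x^{T}\beta + \theta_{\text{subdistrict}} + \phi_{\text{district}}\)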

where Model 0 estimated fixed effects only and is referred to as the null model. Model 1 added unstructured heterogeneity effects at subdistrict level. Model 2 considered both unstructured random effects and spatially structured variation at subdistrict level. Models 3 and 4, analogous to Models 1 and 2, estimated the random effects at the higher (district) level, to allow comparison with the gains from modelling at the more disaggregated level. Model 5 combined the unobserved effects at subdistrict and district level, as we envisage that childhood co-morbidity can be influenced by factors at all levels, i.e. individual risk factors, shared effects of the immediate community (subdistrict) and the greater community (district).

Model comparison was based on the Deviance Information Criterion [25]. This is given by DIC = \(\overline{D}\) + \(p_D\), where \(\overline{D}\) is the posterior mean of the deviance and represents the fit of the model to the data. The component \(p_D\) is the effective number of parameters, computed as \(\overline{D}\) minus the deviance evaluated at the posterior mean of the parameters, and it assesses the complexity of the model. Since small values of \(\overline{D}\) indicate good fit while small values of \(p_D\) indicate a parsimonious model, small values of DIC indicate a better model. Models whose DIC differs from that of the best model by less than 3 cannot be distinguished from it, while those with differences of 3–7 can be weakly differentiated [25, p. 613].
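The DIC calculation and the comparison rule can be sketched as follows, using the standard definitions from [25]: \(\overline{D}\) is the average of the sampled deviances and \(p_D\) is \(\overline{D}\) minus the deviance at the posterior means. The function names are hypothetical, and the label used for differences above 7 ("clearly worse") is an assumption added here for completeness; the text only describes the first two bands.

```python
def dic_summary(deviance_samples, deviance_at_posterior_mean):
    """Return (D_bar, p_D, DIC) from MCMC deviance draws."""
    d_bar = sum(deviance_samples) / len(deviance_samples)  # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean               # effective number of parameters
    return d_bar, p_d, d_bar + p_d


def compare_to_best(dic, best_dic):
    """Classify a model by its DIC difference from the best model."""
    diff = dic - best_dic
    if diff < 3:
        return "indistinguishable"
    if diff <= 7:
        return "weakly differentiated"
    return "clearly worse"  # assumed label for differences above 7
```

For instance, three deviance draws of 100, 102 and 104 with a deviance of 98 at the posterior means give \(\overline{D}=102\), \(p_D=4\) and DIC = 106.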

The models were estimated in \({\tt BayesX}\) 1.4 [26], using Markov chain Monte Carlo (MCMC) simulation techniques. For all models, we ran 35,000 iterations, discarded the first 5,000 as burn-in, and thinned the remainder by keeping every 30th draw. This gave a final sample of 1,000 draws from which model parameters were estimated. A sensitivity analysis was performed by changing the prior distributions for the variance components \(\sigma^2_{\theta}\) and \(\sigma^2_{\phi}\), assuming a variety of inverse gamma priors. In particular, the specifications IG(0.001, 0.001), IG(0.01, 0.01), IG(0.5, 0.0005) and IG(1, 0.026), which represent different degrees of uncertainty and are commonly used in the disease mapping literature, were examined; the results gave relatively similar inference on risks of morbidity, variance components and model fit.
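The size of the retained posterior sample follows from the run length, burn-in and thinning interval. A minimal sketch of that arithmetic (the helper `retained_draws` is hypothetical and is not part of the fitting software):

```python
def retained_draws(n_iterations, burn_in, thin):
    """Count the draws kept after discarding burn-in and keeping every `thin`-th draw."""
    return len(range(burn_in, n_iterations, thin))


n_kept = retained_draws(n_iterations=35_000, burn_in=5_000, thin=30)  # 1000
```

Thinning by 30 trades sample size for reduced autocorrelation between successive retained draws.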

Results

Tables 2 and 3 give values for model fit and complexity for the co-morbidity health outcomes (i.e. the first four categories) only. The null model (Model 0) was the least complex and fitted poorly. Models 1 and 2 provided an improved fit, but at increased complexity. Including random effects improved the fit and generally reduced the DIC; in Model 1, however, adding unstructured subdistrict heterogeneity effects increased the DIC because the added complexity (\(p_D=253.40\)) offset the gains made in model fit (\(\overline{D}=10196.68\)). Comparing Model 1 with Model 3, which included district effects, we observed that Model 3 performed slightly better (Table 3), which can be explained by its lower complexity (\(p_D=253.40\) in Model 1 versus 100.49 in Model 3). The variance components for the random effects in Model 1 (\(\sigma^2_{\phi}\) (subdistrict)) were 0.29, 0.17, 0.21 and 0.45 for outcome categories I (fever, diarrhoea, pneumonia), II (fever, diarrhoea), III (fever, pneumonia) and IV (diarrhoea, pneumonia) respectively, similar to those in Model 3 (\(\sigma^2_{\phi}\) (district)): 0.23, 0.18, 0.16 and 0.37 for the same four categories.

Table 2 Measures of model fit and complexity and estimates of random effects at subdistrict level
Table 3 Measures of model fit and complexity and estimates of random effects at subdistrict and district levels

Including spatially structured subdistrict effects in Model 2 improved the model fit (\(\overline{D}=10181.19\)) and decreased the DIC (10417.18). The variance components for the heterogeneity terms (\(\sigma^2_{\phi}\) (subdistrict)) were reduced to 0.02, 0.07, 0.05 and 0.42 for outcome categories I (fever, diarrhoea, pneumonia), II (fever, diarrhoea), III (fever, pneumonia) and IV (diarrhoea, pneumonia) respectively. For the spatial effects, the variance components (\(\sigma^2_{\theta}\) (subdistrict)) were 0.66, 0.30, 0.42 and 0.14 for the same four categories. In Model 4 (Table 3), district spatial effects were added to the unstructured heterogeneity terms of Model 3, and these explained a proportion ρ = 0.79, 0.75, 0.92 and 0.40 of the total spatial variability for categories I to IV respectively. With regard to the DIC, Model 2 was better than Model 4. Fitting district as an unstructured random effect and subdistrict as a structured spatial effect (Model 5) improved the model further in terms of the DIC (10402.31); this was the best model.
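One common way to summarise how much of the random-effect variability is spatially structured is the ratio of the structured variance to the sum of the structured and unstructured variances. The sketch below applies that ratio to the Model 2 subdistrict variance components quoted above. Note that this definition is an assumption made here for illustration; the text does not state the formula behind the reported ρ values for Model 4, and the numbers produced by this sketch are simple arithmetic on the quoted variances, not results from the paper.

```python
def spatial_fraction(sigma2_theta, sigma2_phi):
    """Share of random-effect variance that is spatially structured (assumed definition)."""
    total = sigma2_theta + sigma2_phi
    if total <= 0:
        raise ValueError("variance components must sum to a positive value")
    return sigma2_theta / total


# Model 2 subdistrict variance components for categories I-IV, as quoted in the text.
structured = [0.66, 0.30, 0.42, 0.14]
unstructured = [0.02, 0.07, 0.05, 0.42]
fractions = [spatial_fraction(t, p) for t, p in zip(structured, unstructured)]
```

Under this assumed definition, the structured component dominates for categories I to III, while for category IV most of the variability is unstructured.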

Table 4 provides estimates for the fixed effects based on the best model (Model 5). The second column gives the estimated odds ratios for co-morbidity of fever, diarrhoea and pneumonia versus observing none of the illnesses within the 2-week period preceding the survey date. With regard to this co-morbidity, children aged 0–5, 6–11, 12–23 or 24–35 months were at higher risk than children aged 36–59 months. Rural children were at increased risk of co-morbidity of fever, diarrhoea and pneumonia relative to urban children (ROR = 1.32, 95% CI: 1.13, 1.55). Weight-for-age was associated with reduced risk (ROR: 0.78, 95% CI: 0.69, 0.86). Figure 1 displays the residual spatial effects at subdistrict level for co-morbidity of fever, diarrhoea and pneumonia. The left map shows the ROR, which ranged from 0.55 (white) to 2.45 (black). The south-eastern and central regions showed high risk, while the northern region was at lower risk. This is supported by the corresponding probability map (right panel), which shows areas of excess risk at a nominal value of 80%, a decision rule based on Richardson et al. [8].

Table 4 Regression coefficient summaries for the best model (Model 5) fitted to data on children co-morbidity of diarrhoea, pneumonia and fever
Fig. 1 Residual spatial effects at sub-district level (I. all three illnesses (fever, diarrhoea and pneumonia) concurrently versus no/one illness only). The left map shows the relative odds ratio (ROR). The right map shows the corresponding posterior probabilities of ROR >1: <20 per cent white, 20–80 per cent grey, >80 per cent black
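The shading rule used in the probability maps can be sketched from posterior draws of an area's ROR: the posterior probability of ROR > 1 is estimated by the share of draws exceeding 1 and is then binned at 20 and 80 per cent. The function names and the sample values are hypothetical; the treatment of probabilities exactly equal to a cut-off is an assumption.

```python
def exceedance_probability(ror_samples):
    """Estimate P(ROR > 1) as the share of posterior draws above 1."""
    return sum(1 for r in ror_samples if r > 1.0) / len(ror_samples)


def map_shade(probability):
    """Shade an area: white below 20 per cent, black above 80 per cent, grey otherwise."""
    if probability < 0.20:
        return "white"
    if probability > 0.80:
        return "black"
    return "grey"  # includes the boundary values 0.20 and 0.80 (assumed)


draws = [0.9, 1.1, 1.2, 1.3, 1.5]          # hypothetical posterior draws for one area
shade = map_shade(exceedance_probability(draws))
```

An area is therefore flagged as having excess risk only when more than 80 per cent of its posterior draws lie above 1.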

Fixed effects estimates for co-morbidity of fever and diarrhoea versus observing none are also given in Table 4. The risk was high for children aged 0–5, 6–11, 12–23 and 24–35 months relative to children aged 36–59 months. However, the risk of fever-diarrhoea co-morbidity did not differ significantly between rural and urban children (ROR: 1.14, 95% CI: 0.95, 1.41). Weight-for-age was again associated with reduced risk of co-infection with fever and diarrhoea (ROR: 0.71, 95% CI: 0.61, 0.83). Figure 2 shows the spatial effects of co-infection of fever and diarrhoea relative to one or no infection. The relative odds ratio ranged from 0.85 to 1.87 (left map). Although the northern and central regions were at lower risk and the southern tip was at increased risk, none of the areas showed excess risk compared to the overall mean risk (ROR = 1), as evidenced by the corresponding probability map (Fig. 2, right panel).

Fig. 2 Residual spatial effects at sub-district level (II. fever and diarrhoea concurrently versus no/one illness only). The left map shows the relative odds ratio (ROR). The right map shows the corresponding posterior probabilities of ROR >1: <20 per cent white, 20–80 per cent grey, >80 per cent black

The risk factors of fever and pneumonia co-morbidity versus no infection were age of the child, place of residence, and vitamin A supplement (Table 4). Children at age 0–5, 6–11, 12–23 or 24–35 months were at increased risk compared to those aged 36–59 months. The risk was higher for rural children compared to urban children (ROR: 1.27, 95% CI: 1.15, 1.41). Those who received vitamin A supplement compared to those who did not were at increased risk (ROR: 1.11, 95% CI: 1.02, 1.20). Perhaps it is not surprising that those who received vitamin A have an increased risk because it might be that they were given vitamin A because of initial higher risk. Figure 3 gives spatial effects at subdistrict level. The highest risk was in the south-eastern and central region (ROR = 1.46). Lowest risk was at the northern tip (ROR = 0.67). However, none of the areas registered excess risk at 80% nominal value.

Fig. 3 Residual spatial effects at sub-district level (III. fever and pneumonia concurrently versus no/one illness only). The left map shows the relative odds ratio (ROR). The right map shows the corresponding posterior probabilities of ROR >1: <20 per cent white, 20–80 per cent grey, >80 per cent black

The last column in Table 4 gives estimates for fixed effects of diarrhoea and pneumonia co-morbidity. The risk was again associated with the child’s age, owning and using a bednet, and weight-for-age. At ages 0–5, 6–11 and 12–23 months relative to 36–59 months, the risk of having diarrhoea and pneumonia was high (ROR = 1.59, 95% CI: 1.13, 2.16; ROR = 2.22, 95% CI: 1.69, 2.88; ROR = 1.88, 95% CI: 1.49, 2.45 respectively). Those aged 24–35 months were not significantly different from those aged 36–59 months (ROR: 1.15, 95% CI: 0.85, 1.53). Higher weight-for-age was associated with reduced risk (ROR: 0.73, 95% CI: 0.60, 0.89); that is, underweight children were at increased risk. Children in households possessing a bednet were at increased risk compared to those from households without a bednet, whereas children using a net were at reduced risk of diarrhoea and pneumonia compared to those who did not (ROR: 0.48, 95% CI: 0.32, 0.75). The unexpected result that bednet possession increased the risk of co-morbidity suggests that having a bednet does not always translate into usage. Bednet ownership, unlike usage, is a poor indicator of protection against mosquito-transmitted diseases, and may therefore give unexpected results. Figure 4 gives maps of spatial effects. There was no area of significant excess risk, as the estimated ROR ranged only between 0.98 and 1.01.

Fig. 4 Residual spatial effects at sub-district level (IV. diarrhoea and pneumonia concurrently versus no/one illness only). The left map shows the relative odds ratio (ROR). The right map shows the corresponding posterior probabilities of ROR >1: <20 per cent white, 20–80 per cent grey, >80 per cent black

Finally, Table 5 gives the fixed effects for the single diseases. The risk of childhood fever increased with rural residence relative to urban residence. Children aged 0–5, 6–11, 12–23 and 24–35 months were at higher risk of fever than those aged 36–59 months. Those who received vitamin A were at increased risk of fever relative to those who did not. Net ownership, net usage and weight-for-age were associated with lower risk of fever. Risk factors positively associated with diarrhoea were rural place of residence and all age groups. Lower risk of diarrhoea was observed with bednet ownership, bednet usage and weight-for-age. Pneumonia was positively associated with all age groups, rural residence, weight-for-age and vitamin A uptake, while lower risk was observed with bednet usage. The effect sizes for the risk factors changed when mono-morbidities were compared with co-morbidities. For example, the age effects were relatively higher in mono-morbidities than in co-morbidities. Similarly, the effects of residence were significant for mono-infections of fever and pneumonia, and the same effects were reflected in the fever and pneumonia co-infections, e.g. categories I (fever, diarrhoea, pneumonia) and III (fever, pneumonia), but not where diarrhoea combines with one other illness, e.g. categories II (fever, diarrhoea) and IV (diarrhoea, pneumonia). Clearly, risk factors that increase the risk of the individual diseases produce a correspondingly increased risk of their co-morbidities.

Table 5 Regression coefficient summaries for the best model (Model 5) fitted to data on children mono-morbidities of fever, diarrhoea and pneumonia

Discussion

The central aim of this study was to identify areas of high and low risk of childhood co-morbidities of fever, diarrhoea and pneumonia in Malawi, after adjusting for individual-specific risk factors. We analysed self-reported health data, collected in a cross-sectional nationwide survey, to measure residual spatial patterns at both district and sub-district levels. Our results provide evidence of the impact of geographical location on childhood health, and these patterns can be compared with potential environmental risk factors. Our approach used a mixed multinomial logit model to analyse different combinations of co-morbidity of fever, diarrhoea and pneumonia. This formulation, which restructures the binary responses into a multi-categorical response variable, is appropriate for the three diseases because they overlap epidemiologically [5].

The overall prevalences of fever–diarrhoea–pneumonia, fever–diarrhoea, fever–pneumonia and diarrhoea–pneumonia co-morbidity were 11, 5, 22 and 3% respectively in this sample of under-five children. There was considerable spatial correlation for the four co-morbidity outcomes, as evidenced by the maps. Residual risk estimates ranged from 0.55 to 2.45 for fever–diarrhoea–pneumonia, from 0.85 to 1.87 for fever–diarrhoea, from 0.67 to 1.46 for fever–pneumonia, and from 0.98 to 1.01 for diarrhoea–pneumonia co-morbidities. This supports our initial expectation of strong spatial dependence at subdistrict or district level. From the analysis we were able to establish that the northern region was at relatively lower risk for all four response outcomes, while the central and south-eastern regions were at higher risk of childhood co-morbidity.

Nevertheless, the residual spatial variation was weak, as demonstrated by the corresponding probability maps. The lack of excess risk in the spatial effects can be explained by the fact that much of the total variability in the responses has already been accounted for by the fixed individual covariates. Various risk factors affect the health outcomes differently, and thus affect the geographical extent and spread of childhood co-morbidity in different areas of the country. The maps therefore show the overall effect of these (and any latent or unobserved) factors, and essentially serve to highlight areas of excess risk, whatever the cause.

Further research is therefore needed to disentangle the actual risk factors contributing to spatial co-morbidity of these diseases. For example, the residual maps can be compared with an HIV risk map when such data become available. For a country with a high prevalence of HIV [27], the relationship observed between fever, diarrhoea and pneumonia may arise because symptoms of HIV include fever and diarrhoea. In addition, pneumonia is a common opportunistic infection associated with HIV infection. HIV prevalence therefore remains a potential risk factor which may explain spatially structured residual variation in childhood morbidities.

One may also link malaria estimates, for example those derived from a malaria risk map [28], to DHS data that measure children’s health and investigate whether the spatial residual effects are attenuated when these are adjusted for in the model. Indeed, the clustering of fever and diarrhoea risk in the central and south-eastern regions (Figs. 1 and 2) has been attributed to the unmeasured effect of malaria risk [13, 14]. In fact, previous work in the area shows that malaria risk is higher in the central and southern regions than in other areas [29]. This, coupled with the lack of sustainable malaria control programmes or low coverage of antimalarial interventions such as insecticide-treated bednets [28], means that perennial malaria risk may be responsible for adverse co-morbidities of fever, diarrhoea and pneumonia [1, 5]. In highly malaria-endemic areas like Malawi, the disease presents in various forms, ranging from asymptomatic parasitaemia, which may present as unspecific fever, to complicated and severe malaria. Thus, a bout of malaria may trigger other opportunistic diseases such as pneumonia.

In addition, social, cultural and other environmental factors that may have a cumulative effect on childhood health are worth investigating, particularly in high-risk clusters. For instance, flooded plains in the south lead to food shortages, resulting in childhood malnutrition. This is another important risk factor for childhood co-morbidity and mortality [30], which may warrant further research. Increased population density in both the central and southern regions, leading to over-cultivated land and severe food shortages and thus inducing deep poverty [31], is likely to contribute to higher risk of childhood co-morbidity of diarrhoea, pneumonia and fever, and other opportunistic infections. Variability in socio-economic status and sanitation may also explain the spatial variation in childhood co-infections. The World Bank report of 2000 stated that districts in the southern part of the country were worst off as regards deprivation [32]. Our analysis showed a clear geographical clustering of high risk for the four disease combinations in these areas.

Having accounted for spatial heterogeneity, the paper also provides evidence that individual risk factors influence the pattern of the disease. The results show that age of the child and place of residence (rural or urban) are important predictors not only for the co-morbidities but also for single diseases. The effect of age is particularly interesting. Generally, children of all ages were at increased risk; however, children who were of an age where they were likely to be weaned (6–23 months) appeared to be at the greatest risk. Very young infants (0–5 months) may have been breastfed, and therefore protected by maternal immunity, and older children were less at risk of disease, probably because of acquired immunity. The results emphasize the need for interventions targeted at the 6–23 month age group [7]. These may include micronutrient supplements, e.g. vitamin A, and use of insecticide treated nets, and combined interventions would be cost-effective to implement [7]. Rural areas and poor neighbourhoods are particularly vulnerable and deserve attention when scaling up interventions [1].

Our modelling approach to co-morbidity used the multinomial model; however, it is possible to analyse the spatially clustered binary outcomes using multivariate spatial models [8, 9, 15, 16]. Although both the multinomial and multivariate approaches allow simultaneous analysis of events, the multivariate model invokes the multivariate CAR model [15, 16]. The advantage of multinomial models is that we are still working within the univariate formulation, except that the response is multi-categorical. Another attractive feature of this approach is that it offers a separate treatment of co-morbidities versus a control group (the “none” category). Thus the interpretation of the fixed effects is easy, because we are able to quantify the effect of the covariates on each disease and on the combined diseases and compare them with the none category.
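To illustrate how a multinomial model is read against the “none” category, the following is a minimal Python sketch of a baseline-category logit, not the fitted model from this analysis. The function name, the category labels and the linear predictor values are hypothetical; in the analysis itself each linear predictor would combine fixed effects with the spatial random effects.

```python
import math

def multinomial_probs(linear_predictors):
    """Baseline-category logit probabilities.

    linear_predictors maps each non-reference category to its linear
    predictor (eta). The reference category "none" has eta fixed at 0.
    """
    exp_eta = {cat: math.exp(eta) for cat, eta in linear_predictors.items()}
    denom = 1.0 + sum(exp_eta.values())  # the 1.0 is exp(0) for "none"
    probs = {cat: value / denom for cat, value in exp_eta.items()}
    probs["none"] = 1.0 / denom
    return probs

# Hypothetical linear predictors for two outcome categories (illustration only).
eta = {"fever only": 0.4, "fever + diarrhoea": -0.3}
probs = multinomial_probs(eta)

# The ratio of each category's probability to that of "none" equals exp(eta),
# which is why covariate effects are interpreted relative to the none category.
ratios_vs_none = {cat: probs[cat] / probs["none"] for cat in eta}
```

Because the reference category is fixed, an exponentiated coefficient for any covariate is read as a relative risk ratio of that outcome category against “none”.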

This study may be seen as dwelling on some measure of “severity”, and a similar multinomial approach has been used to map Schistosoma mansoni-hookworm co-infection [10]. However, the multinomial approach does not model the geographical correlation between diseases, which may be of interest in other epidemiological investigations. Another disadvantage is that as the number of dependent variables increases, say beyond three diseases, the number of categories to be estimated also expands rapidly, making it difficult to estimate and interpret the results. This is where the multivariate spatial approach may offer many advantages. We are currently investigating the use of multivariate spatial models to analyse patterns of childhood co-morbidity and associated covariates while controlling for the estimated correlation among the diseases, with spatial correlation modelled by the geostatistical approach. A good starting point is WinBUGS [33], which provides off-the-shelf MCMC fitting of these models. In fact, one may adapt the sample code currently available in the package.
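The growth in the number of categories can be made concrete with a short Python sketch. It assumes that every combination of binary disease indicators is kept as its own outcome category, which an analysis need not do; the function name and the disease lists are illustrative only.

```python
from itertools import product

def outcome_categories(diseases):
    """List every presence/absence combination of the given diseases.

    With k binary disease indicators there are 2**k combinations,
    including the "none" category where no disease is present.
    """
    categories = []
    for flags in product([0, 1], repeat=len(diseases)):
        present = [name for name, flag in zip(diseases, flags) if flag]
        categories.append(" + ".join(present) if present else "none")
    return categories

three = outcome_categories(["fever", "diarrhoea", "pneumonia"])
# Adding a hypothetical fourth outcome doubles the number of categories.
four = outcome_categories(["fever", "diarrhoea", "pneumonia", "anaemia"])
```

Each additional disease doubles the number of categories, and every non-reference category carries its own set of fixed and spatial effects, so the estimation burden grows accordingly.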

In our analysis, spatial correlation was modelled using the CAR prior; however, this is sensitive to the hyperprior specification of the variance \(\sigma^2_\theta\). Similar behaviour applies to the variance component of the exchangeable prior, \(\sigma^2_\phi\), and sensitivity analysis should be performed, as carried out in this analysis. However, use of weakly informative prior distributions, for moderate to large datasets, is likely to produce more stable estimates [34, 35]. Gelman [36] proposes use of other priors on the variance, such as the uniform prior, e.g. U(0.001, 1000), or a more informative prior such as a truncated t-distribution, which are better behaved than the inverse-Gamma distribution under hierarchical models.
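As a starting point for such a sensitivity analysis, the candidate hyperpriors can be written as log-density functions and compared over a grid of values. The Python sketch below is illustrative only: the function names, the hyperparameter values and the grid point are assumptions, and the t-distribution is taken to be truncated at zero with hypothetical degrees of freedom and scale.

```python
import math

def log_inverse_gamma(variance, shape, rate):
    """Log density of an inverse-Gamma(shape, rate) prior on a variance."""
    return (shape * math.log(rate) - math.lgamma(shape)
            - (shape + 1.0) * math.log(variance) - rate / variance)

def log_uniform(value, lower=0.001, upper=1000.0):
    """Log density of a U(lower, upper) prior; -inf outside the support."""
    if lower <= value <= upper:
        return -math.log(upper - lower)
    return float("-inf")

def log_half_t(value, df, scale):
    """Log density of a Student t-distribution truncated to value >= 0."""
    if value < 0:
        return float("-inf")
    return (math.log(2.0) + math.lgamma((df + 1.0) / 2.0)
            - math.lgamma(df / 2.0) - 0.5 * math.log(df * math.pi)
            - math.log(scale)
            - (df + 1.0) / 2.0 * math.log1p((value / scale) ** 2 / df))

# Hypothetical check: how much does the inverse-Gamma(eps, eps) log density
# at a small variance move as eps changes? A large spread suggests the
# posterior may be sensitive to this choice when the variance is near zero.
small_variance = 0.05
ig_log_densities = {eps: log_inverse_gamma(small_variance, eps, eps)
                    for eps in (0.001, 0.01, 0.1)}
```

In practice the model would be refitted under each candidate prior and the posterior summaries of the variance components and fixed effects compared across fits.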

The data used in this analysis were based on self-reported accounts by mothers. Self-reported illnesses suffer from some limitations. The outcomes depend on the mothers’ recall, which may lead to bias, although limiting the recall period to 14 days reduces the bias [37].

In conclusion, this analysis investigated geographical patterns of childhood morbidities of fever, diarrhoea and pneumonia, adjusting for individual-specific risk factors. While there are several other studies that considered childhood morbidity, our analysis demonstrated that a number of diseases, i.e., fever, diarrhoea and pneumonia, overlap epidemiologically and that the pattern is spatially structured. The emphasis is that, over and above individual-specific risk factors, latent and unobserved risk factors directly influence childhood co-morbidity.

Although this spatial analysis is not exhaustive and does not involve use of all data collected, it is hoped that it will assist in policy development and planning, advocacy, resource allocation, implementation, and monitoring and evaluation. Currently a number of interventions for these common illnesses, for example those for fever and pneumonia, are being implemented, mostly separately. Cost-effective implementation of control can be achieved if some of these interventions are applied in an integrated manner, probably through simultaneous spatial targeting of resources. In fact, the Integrated Management of Childhood Illnesses strategies now recommend multi-faceted targeting of interventions. The geographical impact of location when implementing interventions must be recognized, as it affects the epidemiology of diseases and the coverage of interventions.