Abstract
In this paper we consider Bayesian regression models for count data that allow for overdispersion. We account for unobserved heterogeneity in the data in two ways. On the one hand, we consider models more flexible than the common Poisson model, which allow for overdispersion in different ways. In particular, we address the negative binomial and the generalized Poisson (GP) distributions, in which overdispersion is modelled by an additional parameter. Further, we discuss zero-inflated models, in which overdispersion is assumed to be caused by an excessive number of zeros. On the other hand, extra spatial variability in the data is taken into account by adding correlated spatial random effects to the models. This approach allows for an underlying spatial dependency structure, which is modelled using a conditional autoregressive prior based on Pettitt et al. (Stat Comput 12(4):353–367, 2002). In an application, the presented models are used to analyse the number of invasive meningococcal disease cases in Germany in the year 2004. Models are compared according to the deviance information criterion (DIC) suggested by Spiegelhalter et al. (J R Stat Soc B 64(4):583–640, 2002) and using proper scoring rules; see, for example, Gneiting and Raftery (Technical Report no. 463, University of Washington, 2004). We observe a rather high degree of overdispersion in the data, which is captured best by the GP model when spatial effects are neglected. While adding spatial effects to the models allowing for overdispersion gives little or no improvement, spatial Poisson models with spatially correlated or uncorrelated random effects are to be preferred over all other models according to the considered criteria.
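To make the notion of overdispersion concrete, the following minimal sketch compares the variance-to-mean ratio of the count distributions named in the abstract. The parametrizations are assumptions chosen for illustration and need not match those used in the paper: the negative binomial is written with mean `mu` and size `r`, the generalized Poisson in the form of Consul and Jain (1973) with parameters `theta` and `lam`, and the zero-inflated Poisson with extra zero probability `p_zero`. All function names are hypothetical.

```python
import math


def poisson_pmf(y, mu):
    # Poisson: mean and variance both equal mu (no overdispersion).
    return math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))


def negbin_pmf(y, mu, r):
    # Negative binomial with mean mu and size r (assumed parametrization):
    # variance is mu + mu**2 / r.
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)


def gen_poisson_pmf(y, theta, lam):
    # Generalized Poisson in the Consul-Jain form, for 0 <= lam < 1:
    # mean theta / (1 - lam), variance theta / (1 - lam)**3.
    log_p = (math.log(theta) + (y - 1) * math.log(theta + y * lam)
             - theta - y * lam - math.lgamma(y + 1))
    return math.exp(log_p)


def zip_pmf(y, mu, p_zero):
    # Zero-inflated Poisson: an extra point mass p_zero at zero,
    # mixed with a Poisson(mu) component of weight 1 - p_zero.
    base = (1.0 - p_zero) * poisson_pmf(y, mu)
    return p_zero + base if y == 0 else base


def mean_var(pmf, upper=400):
    # Numerical mean and variance from the pmf truncated at `upper`;
    # the truncation error is negligible for the parameters used below.
    probs = [pmf(y) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


if __name__ == "__main__":
    # All four models are tuned to have mean 2, so only the variance differs.
    models = {
        "Poisson": lambda y: poisson_pmf(y, 2.0),
        "negative binomial": lambda y: negbin_pmf(y, 2.0, 1.0),
        "generalized Poisson": lambda y: gen_poisson_pmf(y, 1.0, 0.5),
        "zero-inflated Poisson": lambda y: zip_pmf(y, 4.0, 0.5),
    }
    for name, pmf in models.items():
        m, v = mean_var(pmf)
        print(f"{name}: mean={m:.3f}, variance={v:.3f}, ratio={v / m:.3f}")
```

With a common mean, a variance-to-mean ratio above one indicates overdispersion relative to the Poisson model. In the negative binomial and generalized Poisson cases the excess comes from the additional parameter, whereas in the zero-inflated case it comes from the extra mass at zero.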
References
Agarwal DK, Gelfand AE, Citron-Pousty S (2002) Zero-inflated models with application to spatial count data. Environ Ecol Stat 9:341–355
Angers JF, Biswas A (2003) A Bayesian analysis of zero-inflated generalized Poisson model. Comput Stat Data Anal 42:37–46
Banerjee S, Carlin B, Gelfand A (2004) Hierarchical modeling and analysis for spatial data. Chapman & Hall/CRC, New York
Besag J, Kooperberg C (1995) On conditional and intrinsic autoregressions. Biometrika 82:733–746
Brier G (1950) Verification of forecasts expressed in terms of probability. Mon Weather Rev 78(1):1–3
Consul P (1989) Generalized Poisson distributions: properties and applications. Marcel Dekker, New York
Consul P, Jain G (1973) A generalization of the Poisson distribution. Technometrics 15:791–799
Czado C, Prokopenko S (2004) Modeling transport mode decisions using hierarchical binary spatial regression models with cluster effects. Discussion paper 406, SFB 386 Statistische Analyse diskreter Strukturen. http://www.stat.uni-muenchen.de/sfb386/
Famoye F, Singh K (2003a) On inflated generalized Poisson regression models. Adv Appl Stat 3(2):145–158
Famoye F, Singh K (2003b) Zero inflated generalized Poisson regression model (submitted)
Gelman A, Carlin J, Stern H, Rubin D (2004) Bayesian data analysis, 2nd edn. Chapman & Hall/CRC, Boca Raton
Gilks W, Wild P (1992) Adaptive rejection sampling for Gibbs sampling. Appl Stat 41(2):337–348
Gilks W, Richardson S, Spiegelhalter D (1996) Markov chain Monte Carlo in practice. Chapman & Hall/CRC, Boca Raton
Gneiting T, Raftery AE (2004) Strictly proper scoring rules, prediction and estimation. Technical Report no. 463, Department of Statistics, University of Washington
Gschlößl S (2006) Hierarchical Bayesian spatial regression models with applications to non-life insurance. PhD thesis, Munich University of Technology
Han C, Carlin B (2001) Markov chain Monte Carlo methods for computing Bayes factors: a comparative review. J Am Stat Assoc 96:1122–1132
Hoeting J, Madigan D, Raftery A, Volinsky C (1999) Bayesian model averaging: a tutorial. Stat Sci 14(4):382–417
Jin X, Carlin B, Banerjee S (2005) Generalized hierarchical multivariate CAR models for areal data. Biometrics 61:950–961
Joe H, Zhu R (2005) Generalized Poisson distribution: the property of mixture of Poisson and comparison with negative binomial distribution. Biometrical J 47:219–229
Kass R, Raftery A (1995) Bayes factors and model uncertainty. J Am Stat Assoc 90:773–795
Lambert D (1992) Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34(1):1–14
van der Linde A (2005) DIC in variable selection. Statistica Neerlandica 59(1):45–56
Pettitt A, Weir I, Hart A (2002) A conditional autoregressive Gaussian process for irregularly spaced multivariate data with application to modelling large sets of binary data. Stat Comput 12(4):353–367
Rodrigues J (2003) Bayesian analysis of zero-inflated distributions. Commun Stat 32(2):281–289
Spiegelhalter D, Best N, Carlin B, van der Linde A (2002) Bayesian measures of model complexity and fit. J R Stat Soc B 64(4):583–640
Sun D, Tsutakawa RK, Kim H, He Z (2000) Bayesian analysis of mortality rates with disease maps. Stat Med 19:2015–2035
Winkelmann R (2003) Econometric analysis of count data, 4th edn. Springer, Berlin Heidelberg, Germany
Cite this article
Gschlößl, S., Czado, C. Modelling count data with overdispersion and spatial effects. Statistical Papers 49, 531–552 (2008). https://doi.org/10.1007/s00362-006-0031-6
DOI: https://doi.org/10.1007/s00362-006-0031-6