Abstract
The Iraq conflict is one of the most outrageous and unprovoked aggressions unleashed by the West. Here, we provide a statistical analysis of the number of civilian deaths resulting from the US-led invasion. For this purpose, we propose several new discrete distributions. The distributions are fitted to the data on the number of deaths by maximum likelihood. Variables like province, cause of death and time are taken as covariates. Useful predictions are given on the number of deaths.
1 Introduction
The invasion of Iraq led by US forces began on 20 March 2003. Other countries also sent forces to Iraq, including the UK, South Korea, Italy, Poland, Australia, Georgia, Ukraine, Netherlands, and Spain.
Much of the evidence for the Iraq war was based on weapons of mass destruction (including yellowcake uranium, poison gas and biological weapons), connections to anthrax attacks and connections to the 11 September attacks. As is often the case in the West, most of this evidence was fabricated and found to have no substance. The Iraq conflict has led to over one million deaths, including deaths of over 100,000 civilians.
US-led forces also committed numerous human rights abuses during the conflict, including the Abu Ghraib torture and prisoner abuse, the Haditha killings of 24 civilians, white phosphorus use, the gang-rape and murder of a 14-year-old girl and the murder of her family in Mahmoudiyah, the torture and killing of a prisoner of war, the Iraqi Air Force commander Abed Hamed Mowhoush, the killing of Baha Mousa, the Mukaradeeb wedding party massacre of 42 civilians, and the Blackwater Baghdad shootings.
There has been considerable academic interest in the Iraq conflict and its effect. Most of the academic papers published focus on issues relating to the military personnel, the perpetrators of the unjustified, bloody, and criminal invasion. For example, Nason and Bailey (2008) propose an approach for estimating intensity of deaths of coalition personnel. We question the morality of these and other authors. In our opinion, these authors and their research are as criminal as the invasion itself.
There have been very few papers investigating civilian deaths from the Iraq conflict. The only one we are aware of is Lewis et al. (2012). The main conclusion of this paper is: “Our results indicate that self-excitation makes up as much as 37-50 percent of all violent events and that self-excitation lasts at most between two and six weeks, depending upon the district in question”. It is not clear to us what practical implication this conclusion has.
The aim of this paper is to provide a statistical analysis of civilian deaths from the Iraq conflict. This paper appears to be the first of its kind with respect to the Iraq conflict.
The contents of this paper are organized as follows. The data on the number of civilian deaths are described in Sect. 2. A range of discrete distributions for modeling the data is listed in Sect. 3. Many of these distributions are new. Statistical modeling of the data is described in Sect. 4. Some conclusions of this modeling exercise are noted in Sect. 5.
2 Data
The data for this paper were extracted from http://www.iraqbodycount.org/, a website giving civilian deaths of the Iraq conflict since 2003. We extracted the maximum number of civilians killed biyearly (that is, in each six-month period) in each of the 18 provinces of Iraq (Baghdad, Anbar, Babylon, Basrah, Dahuk, Diyala, Erbil, Kerbala, Missan, Muthanna, Najaf, Ninewa, Qadissiya, Salah al-Din, Sulaymaniyah, Tameem, Thi-Qar, Wassit) by US-led coalition only or US-led coalition including Iraqi forces using explosives, air attacks, gunfire or suicide attacks.
The number of civilians killed was also given weekly, monthly, quarterly and yearly. But the weekly, monthly and quarterly data exhibited significant serial correlation. The biyearly data did not show significant serial correlations. The yearly data were thought to be too few for statistical analysis.
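The kind of serial correlation check described above can be sketched as follows. The paper does not say which test was used, so this is only an illustration: it assumes a lag-1 sample autocorrelation compared against the approximate white-noise bound of 1.96 divided by the square root of the series length. The function names and the toy series are hypothetical and are not Iraq Body Count data.

```python
import math

def autocorrelation(series, lag=1):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t + lag] - mean)
                for t in range(n - lag))
    return numer / denom

def is_significant(r, n, z=1.96):
    """Approximate 5% check: |r| exceeds z / sqrt(n) under white noise."""
    return abs(r) > z / math.sqrt(n)

# Synthetic toy series for illustration only (NOT Iraq Body Count data).
toy = [12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
r1 = autocorrelation(toy, lag=1)
print(r1, is_significant(r1, len(toy)))
```

A strongly trending series like the toy one gives a large lag-1 autocorrelation; aggregating to a coarser time unit is one way such dependence can be reduced.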
The website http://www.iraqbodycount.org/ also reported civilian deaths due to Iraqi state forces without coalition, anti-government/occupation forces and others. We did not consider these data since the purpose here is to investigate the effect of Western aggression in Iraq.
Figure 1 shows the distribution of the number of deaths versus the provinces. The number of deaths appears largest for Anbar in terms of median and variability. It appears smallest for Dahuk, Erbil and Tameem in terms of median and variability.
Figure 2 shows the distribution of the number of deaths versus the cause. The number of deaths appears largest due to gunfire, second largest due to air attacks, third largest due to explosives and smallest due to suicide attacks.
Both Figs. 1 and 2 suggest that the number of deaths appears larger at least in terms of variability when the perpetrators are US-led coalition with Iraqi forces (as opposed to US-led coalition only).
3 Models
The data are counts. So, discrete distributions are needed to model them. Unfortunately, most if not all of the discrete distributions available in the literature have limited applicability (Johnson et al. 1992). Here, we list a range of discrete distributions that can be used to model the data. Of the 20 discrete distributions listed, the first 10 are known ones. The remaining 10 discrete distributions (generalized discrete Pareto, discrete Fréchet, discrete lognormal, discrete F, discrete inverse gamma, discrete inverse Gaussian, discrete Birnbaum Saunders, discrete half t, discrete half Cauchy, and discrete half logistic) are new.
The list of 20 distributions includes both light- and heavy-tailed distributions. The Poisson, geometric, logarithmic, Yule, discrete gamma, discrete Weibull, discrete half normal, discrete lognormal, discrete inverse Gaussian, discrete Birnbaum Saunders and discrete half logistic distributions have light tails. The discrete inverse Weibull, Zeta, discrete Burr, generalized discrete Pareto, discrete Fréchet, discrete F, discrete inverse gamma, discrete half t, and discrete half Cauchy distributions have heavy tails.
3.1 Poisson distribution
This distribution is well known and has its probability mass function (pmf) specified by
\[
p(x) = \frac{\lambda ^x \exp (-\lambda )}{x!}
\]
for \(\lambda >0,\) the rate parameter, and \(x=0,\,1,\ldots \)
3.2 Geometric distribution
This distribution is well known and has its pmf specified by
\[
p(x) = p (1 - p)^x
\]
for \(0<p<1,\) the probability parameter, and \(x=0,\,1,\ldots \)
3.3 Logarithmic distribution
This distribution due to Fisher et al. (1943) has its pmf specified by
\[
p(x) = -\frac{p^x}{x \log (1 - p)}
\]
for \(0<p<1,\) the probability parameter, and \(x = 1,\,2,\ldots \)
3.4 Yule distribution
This distribution due to Yule (1925) has its pmf specified by
\[
p(x) = \rho B(x,\,\rho + 1)
\]
for \(\rho >0,\) the shape parameter, and \(x = 1,\,2,\ldots ,\) where \(B(a,\,b) = \int \nolimits _0^1 t^{a - 1}(1 - t)^{b - 1}dt\) denotes the beta function.
3.5 Discrete gamma distribution
This distribution due to Yang (1994) has its pmf specified by
for \(\sigma >0,\) the scale parameter, \(\xi >0,\) the shape parameter, and \(x = 0,\,1,\ldots ,\) where \(\gamma (a,\,x) = \int \nolimits _0^x t^{a - 1} \exp (-t) dt\) denotes the incomplete gamma function and \(\varGamma (a) = \int \nolimits _0^\infty t^{a - 1} \exp (-t)dt\) denotes the gamma function.
3.6 Discrete Weibull distribution
This distribution due to Nakagawa and Osaki (1975) has its pmf specified by
\[
p(x) = q^{x^\theta } - q^{(x + 1)^\theta }
\]
for \(0<q<1,\,\theta >0\) and \(x = 0,\,1,\ldots \) Here, both q and \(\theta \) are shape parameters.
3.7 Discrete inverse Weibull distribution
This distribution due to Jazi et al. (2010) has its pmf specified by
for \(0<q<1\) and \(\theta >0.\) Here, both q and \(\theta \) are shape parameters.
3.8 Zeta distribution
This is a known distribution with its pmf specified by
\[
p(x) = \frac{x^{-s}}{\zeta (s)}
\]
for \(s > 1,\) the shape parameter, and \(x = 1,\,2,\ldots ,\) where
\[
\zeta (s) = \sum _{k=1}^{\infty } k^{-s}
\]
denotes the Riemann zeta function.
3.9 Discrete half normal distribution
This distribution due to Gómez-Déniz et al. (2012) has its pmf specified by
for \(\sigma > 0,\) the scale parameter, and \(x = 0,\,1,\ldots ,\) where \(\varPhi (\cdot )\) denotes the cumulative distribution function of a standard normal random variable.
3.10 Discrete Burr distribution
This distribution due to Krishna and Pundir (2009) has its pmf specified by
for \(a > 0,\,b > 0\) and \(x = 0,\,1,\ldots \) Here, both a and b are shape parameters.
3.11 Generalized discrete Pareto distribution
This distribution is new and has its pmf specified by
for \(\sigma > 0,\) the scale parameter, \(-\infty < \xi < \infty ,\) the shape parameter, and \(x = 0,\,1,\ldots \)
3.12 Discrete Fréchet distribution
This distribution is new and has its pmf specified by
for \(\sigma > 0,\) the scale parameter, \(\xi > 0,\) the shape parameter, and \(x = 0,\,1,\ldots \)
3.13 Discrete lognormal distribution
This distribution is new and has its pmf specified by
for \(\sigma > 0,\) the scale parameter, \(-\infty < \mu < \infty ,\) the location parameter, and \(x = 0,\,1,\ldots \)
3.14 Discrete \(F\) distribution
This distribution is new and has its pmf specified by
for \(\nu _1 > 0,\) the first degree of freedom parameter, \(\nu _2 > 0,\) the second degree of freedom parameter, and \(x = 0,\,1,\ldots ,\) where \(I_x(a,\,b)=\int \nolimits _0^x t^{a - 1} (1 - t)^{b - 1} dt/B(a,\,b)\) denotes the incomplete beta function ratio.
3.15 Discrete inverse gamma distribution
This distribution is new and has its pmf specified by
for \(\sigma > 0,\) the scale parameter, \(\xi > 0,\) the shape parameter, and \(x = 0,\,1,\ldots ,\) where \(\varGamma (a,\,x) =\int \nolimits _x^\infty t^{a - 1} \exp (-t) dt\) denotes the complementary incomplete gamma function.
3.16 Discrete inverse Gaussian distribution
This distribution is new and has its pmf specified by
for \(\sigma > 0,\) the scale parameter, \(\xi > 0,\) the shape parameter, and \(x = 0,\,1,\ldots \)
3.17 Discrete Birnbaum Saunders distribution
This distribution is new and has its pmf specified by
for \(\sigma > 0,\) the scale parameter, \(\xi > 0,\) the shape parameter, and \(x = 0,\,1,\ldots \)
3.18 Discrete half \(t\) distribution
This distribution is new and has its pmf specified by
for \(\nu > 0,\) the degree of freedom parameter, and \(x = 0,\,1,\ldots ,\) where
\[
{}_2F_1(a,\,b;\,c;\,x) = \sum _{k=0}^{\infty } \frac{(a)_k (b)_k}{(c)_k} \frac{x^k}{k!}
\]
denotes the Gauss hypergeometric function and \((f)_k = f (f + 1)\cdots (f + k - 1)\) denotes the ascending factorial.
3.19 Discrete half Cauchy distribution
This distribution is new and has its pmf specified by
for \(\sigma > 0,\) the scale parameter, and \(x = 0,\,1,\ldots \)
3.20 Discrete half logistic distribution
This distribution is new and has its pmf specified by
for \(\sigma > 0,\) the scale parameter, and \(x = 0,\,1,\ldots \)
Note that we have simply listed the pmf for each of the 20 discrete distributions, including the 10 new ones. We have not attempted to derive structural properties like moments or procedures for maximum likelihood estimation. These are not needed in subsequent sections. A detailed study of mathematical properties of the 10 new distributions is a possible future work.
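As an illustration of how discrete analogues of continuous distributions are commonly built, the sketch below assumes the standard construction \(p(x) = F(x + 1) - F(x)\) for a continuous cumulative distribution function \(F\) on the positive half line. This construction is an assumption made for the example; the text here does not confirm that it is the one used for the new distributions. The helper names and the value of \(\sigma \) are hypothetical.

```python
import math

def discretize(cdf):
    """Return the pmf p(x) = F(x + 1) - F(x) on x = 0, 1, ... for a cdf F on [0, inf)."""
    def pmf(x):
        return cdf(x + 1) - cdf(x)
    return pmf

def half_cauchy_cdf(sigma):
    """Continuous half Cauchy cdf with scale sigma."""
    return lambda x: (2.0 / math.pi) * math.atan(x / sigma)

def half_logistic_cdf(sigma):
    """Continuous half logistic cdf with scale sigma: tanh(x / (2 sigma))."""
    return lambda x: math.tanh(x / (2.0 * sigma))

sigma = 5.0  # hypothetical scale, chosen only for illustration
pmf_hc = discretize(half_cauchy_cdf(sigma))
pmf_hl = discretize(half_logistic_cdf(sigma))

# Partial sums telescope to F(N + 1), so they show how much mass lies beyond N.
N = 200
mass_hc = sum(pmf_hc(x) for x in range(N + 1))
mass_hl = sum(pmf_hl(x) for x in range(N + 1))
print(mass_hc, mass_hl)
```

Under this assumed construction, the half Cauchy version still has visible mass beyond \(x = 200,\) while the half logistic version has essentially none, which matches the heavy- versus light-tailed grouping given earlier.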
4 Results
Section 4.1 determines the best of the 20 distributions in Sect. 3 to model the number of civilian deaths. Section 4.2 investigates the effect of province on the number of deaths. Section 4.3 investigates the effect of cause on the number of deaths. Section 4.4 investigates the effect of time on the number of deaths. Section 4.5 provides some useful predictions on the number of deaths.
4.1 Best fitting model
The first step is to see which of the 20 distributions in Sect. 3 gives the best fit for the data. We fitted all of the distributions to the data by the method of maximum likelihood, ignoring the groupings into provinces and the groupings into causes of death. The standard errors were computed by inverting the observed information matrices. The parameter estimates, standard errors, loglikelihood values, values of the Akaike information criterion (AIC; Akaike 1974) and values of the Bayesian information criterion (BIC; Schwarz 1978) are shown in Table 1 when the perpetrators are US-led coalition only. The same quantities are shown in Table 2 when the perpetrators are US-led coalition with Iraqi forces.
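The fitting steps just described can be sketched for the simplest of the 20 models, the Poisson distribution, where the maximum likelihood estimate has a closed form. This is a minimal illustration rather than the code used for the paper: the function name and the toy counts are hypothetical, and the toy counts are not Iraq Body Count data. AIC is taken as \(2k - 2\log L\) and BIC as \(k\log n - 2\log L\) for \(k\) parameters and \(n\) observations.

```python
import math

def poisson_fit(counts):
    """Fit a Poisson model by maximum likelihood; return estimate, SE, logL, AIC, BIC."""
    n = len(counts)
    total = sum(counts)
    lam = total / n                                   # MLE of the rate
    loglik = sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in counts)
    obs_info = total / lam ** 2                       # minus second derivative of logL at the MLE
    se = 1.0 / math.sqrt(obs_info)                    # inverse observed information
    k = 1                                             # number of fitted parameters
    return {
        "lambda": lam,
        "se": se,
        "loglik": loglik,
        "aic": 2 * k - 2 * loglik,
        "bic": k * math.log(n) - 2 * loglik,
    }

# Synthetic toy counts for illustration only (NOT Iraq Body Count data).
toy_counts = [0, 2, 1, 5, 3, 0, 7, 2]
fit = poisson_fit(toy_counts)
print(fit)
```

For models without a closed-form estimate, the same quantities would come from a numerical optimizer and a numerically evaluated Hessian; smaller AIC and BIC values indicate the preferred model.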
Among the one-parameter models, the logarithmic distribution gives the largest loglikelihood value (equivalently, the smallest \(-\log L\)), the smallest AIC value and the smallest BIC value when the perpetrators are US-led coalition only. The discrete half Cauchy distribution gives the largest loglikelihood value, the smallest AIC value and the smallest BIC value when the perpetrators are US-led coalition with Iraqi forces.
Among the two-parameter models, the discrete Birnbaum Saunders distribution gives the largest loglikelihood value, the smallest AIC value and the smallest BIC value both when the perpetrators are US-led coalition only and when they are US-led coalition with Iraqi forces.
Overall, the discrete Birnbaum Saunders distribution gives the best fit by all three criteria in both cases. This distribution is one of the newly proposed distributions in Sect. 3.
4.2 Effect of provinces
We investigate the effect of provinces on the number of deaths. The discrete Birnbaum Saunders distribution has two parameters: the shape parameter, \(\xi ,\) and the scale parameter, \(\sigma .\) We fitted the following models:
- Model 1: \(\xi \) is the same for each province, \(\sigma \) is the same for each province;
- Model 2: \(\xi \) is the same for each province, \(\sigma \) is different for each province;
- Model 3: \(\xi \) is different for each province, \(\sigma \) is the same for each province;
- Model 4: \(\xi \) is different for each province, \(\sigma \) is different for each province.
We obtained the values of \(-\log L = 869.0,\,825.6,\,834.9\) and 783.5 for Models 1–4, respectively, when the perpetrators are US-led coalition only. The values of \(-\log L = 1131.3,\,1066.9,\,1091.0\) and 997.3 were obtained for Models 1–4, respectively, when the perpetrators are US-led coalition with Iraqi forces. It follows by the standard likelihood ratio test that both the shape and scale parameters are different for each province.
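The likelihood ratio comparison can be sketched from the reported \(-\log L\) values for US-led coalition only. The parameter counts below are our reading of Models 1 to 4 with 18 provinces (2, 19, 19 and 36 parameters), and the p-values use the Wilson-Hilferty normal approximation to the chi-square distribution so that the example needs only the standard library; both are assumptions of this sketch, not details given in the text.

```python
import math

def lr_statistic(nll_restricted, nll_full):
    """Likelihood ratio statistic from two negative loglikelihood values."""
    return 2.0 * (nll_restricted - nll_full)

def chi2_sf_approx(x, df):
    """Wilson-Hilferty normal approximation to P(chi-square with df > x)."""
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - 2.0 / (9.0 * df))) / math.sqrt(2.0 / (9.0 * df))
    return 0.5 * math.erfc(z / math.sqrt(2.0))

n_provinces = 18
# Assumed parameter counts: one value per province where a parameter "is different".
n_params = {1: 2, 2: 1 + n_provinces, 3: n_provinces + 1, 4: 2 * n_provinces}
# Reported -log L values, US-led coalition only.
nll_us_only = {1: 869.0, 2: 825.6, 3: 834.9, 4: 783.5}

results = {}
for restricted in (1, 2, 3):
    stat = lr_statistic(nll_us_only[restricted], nll_us_only[4])
    df = n_params[4] - n_params[restricted]
    results[restricted] = (stat, df, chi2_sf_approx(stat, df))
    print(f"Model {restricted} vs Model 4: LR = {stat:.1f}, df = {df}, "
          f"approx p = {results[restricted][2]:.2e}")
```

Each restricted model is rejected in favour of Model 4 at any conventional level under these assumptions, which agrees with the conclusion stated above.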
4.3 Effect of causes of death
We investigate the effect of the cause of death (explosives, air attacks, gunfire or suicide attacks) on the number of deaths. We fitted the following models:
- Model 1: \(\xi \) is the same for each cause, \(\sigma \) is the same for each cause;
- Model 2: \(\xi \) is the same for each cause, \(\sigma \) is different for each cause;
- Model 3: \(\xi \) is different for each cause, \(\sigma \) is the same for each cause;
- Model 4: \(\xi \) is different for each cause, \(\sigma \) is different for each cause.
We obtained the values of \(-\log L = 285.5,\,284.1,\,284.7\) and 271.4 for Models 1–4, respectively, when the perpetrators are US-led coalition only. The values of \(-\log L = 359.5,\,355.8,\,358.0\) and 324.5 were obtained for Models 1–4, respectively, when the perpetrators are US-led coalition with Iraqi forces. It follows by the standard likelihood ratio test that both the shape and scale parameters are different for each cause.
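As a complementary check on the same four models, the reported \(-\log L\) values can be converted to AIC values. This is an illustrative sketch, not part of the original analysis: the parameter counts (2, 5, 5 and 8 for four causes) are our reading of Models 1 to 4, and AIC is computed as \(2k + 2(-\log L).\)

```python
n_causes = 4
# Assumed parameter counts: one value per cause where a parameter "is different".
n_params = {1: 2, 2: 1 + n_causes, 3: n_causes + 1, 4: 2 * n_causes}

# Reported -log L values for Models 1-4.
nll = {
    "US-led coalition only": {1: 285.5, 2: 284.1, 3: 284.7, 4: 271.4},
    "US-led coalition with Iraqi forces": {1: 359.5, 2: 355.8, 3: 358.0, 4: 324.5},
}

def aic(nll_value, k):
    """AIC from a negative loglikelihood value and a parameter count."""
    return 2 * k + 2 * nll_value

best_by_group = {}
for label, values in nll.items():
    scores = {model: aic(values[model], n_params[model]) for model in values}
    best_by_group[label] = min(scores, key=scores.get)
    print(label, {m: round(s, 1) for m, s in scores.items()},
          "best: Model", best_by_group[label])
```

Under these assumed parameter counts, Model 4 has the smallest AIC in both cases, in line with the likelihood ratio conclusion.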
4.4 Effect of time
We investigate how the number of deaths varies with respect to time. Scatter plots of the data (not shown here) suggest that the predominant pattern is a decrease in the number of deaths with respect to time. A decrease can be represented by several mathematical forms. The simplest form is a linear one. We fitted the following models:
- Model 1: \(\sigma = \exp (a),\,\xi = \exp (b);\)
- Model 2: \(\sigma = \exp (a),\,\xi = \exp (b + c\times \mathrm{time});\)
- Model 3: \(\sigma = \exp (a + c \times \mathrm{time}),\,\xi = \exp (b);\)
- Model 4: \(\sigma = \exp (a + c \times \mathrm{time}),\,\xi = \exp (b + c \times \mathrm{time});\)
- Model 5: \(\sigma = \exp (a + b \times \mathrm{time}),\,\xi = \exp (a + c \times \mathrm{time});\)
- Model 6: \(\sigma = \exp (a + b \times \mathrm{time}),\,\xi = \exp (c + d \times \mathrm{time}).\)
Time is in the units of a 6-month period. The exponentiation is used because both the scale and shape parameters are positive by definition, so the linear dependence on time is on the logarithmic scale. Model 1 supposes that both parameters are independent of time. Model 2 supposes that the logarithm of the shape parameter varies linearly with respect to time but the scale parameter remains independent of time. Model 3 supposes that the logarithm of the scale parameter varies linearly with respect to time but the shape parameter remains independent of time. Model 4 supposes that the logarithms of both parameters vary linearly with respect to time with the same slope. Model 5 supposes that the logarithms of both parameters vary linearly with respect to time with the same intercept. Model 6 supposes that the logarithms of both parameters vary linearly with respect to time with no restrictions on slope or intercept.
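The six parameterizations can be written as a single mapping from a model number, a parameter vector and a time index to the pair \((\sigma ,\,\xi ).\) The sketch below is only a restatement of Models 1 to 6 in code; the function name and the argument order are hypothetical.

```python
import math

def bs_parameters(model, params, time):
    """Return (sigma, xi) at `time` (in 6-month units) for Models 1-6."""
    if model == 1:
        a, b = params
        return math.exp(a), math.exp(b)
    if model == 2:
        a, b, c = params
        return math.exp(a), math.exp(b + c * time)
    if model == 3:
        a, b, c = params
        return math.exp(a + c * time), math.exp(b)
    if model == 4:                      # common slope c
        a, b, c = params
        return math.exp(a + c * time), math.exp(b + c * time)
    if model == 5:                      # common intercept a
        a, b, c = params
        return math.exp(a + b * time), math.exp(a + c * time)
    if model == 6:                      # unrestricted
        a, b, c, d = params
        return math.exp(a + b * time), math.exp(c + d * time)
    raise ValueError("model must be an integer from 1 to 6")

# Hypothetical parameter values, chosen only to show the shape of the output.
print(bs_parameters(6, (2.0, -0.1, 0.5, 0.0), time=4))
```

A negative slope for \(\sigma \) makes the scale shrink geometrically with time, which is how a decreasing trend shows up in these models.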
We fitted Models 1–6 to data from each province and to data corresponding to each cause of death when the perpetrators are US-led coalition only or US-led coalition with Iraqi forces. The best fitting models as determined by the likelihood ratio test are shown in Tables 3, 4, 5 and 6.
The number of deaths in Baghdad, Anbar, Babylon, Basrah, Diyala, Ninewa, Tameem and Thi-Qar shows a decreasing trend (the slope parameter for \(\sigma \) is significantly negative). The number of deaths in Dahuk and Salah al-Din shows an increasing trend (the slope parameter for \(\sigma \) is significantly positive). The number of deaths in Erbil, Missan, Qadissiya and Sulaymaniyah does not appear to show significant changes (neither of the slope parameters is significantly different from zero).
For Kerbala, Najaf and Wassit, the number of deaths shows a decreasing trend when the perpetrators are US-led coalition only and does not appear to show significant changes when the perpetrators are US-led coalition with Iraqi forces.
For Muthanna, the number of deaths shows a decreasing trend when the perpetrators are US-led coalition with Iraqi forces and does not appear to show significant changes when the perpetrators are US-led coalition only.
The number of deaths due to explosives, air attacks and gunfire shows a decreasing trend (the slope parameter for \(\sigma \) is significantly negative). The number of deaths due to suicide attacks does not appear to show significant changes (neither of the slope parameters is significantly different from zero).
4.5 Predictions
Using the best fitting models in Sect. 4.4, one can give useful predictions on the number of deaths into the future. Tables 7, 8, 9 and 10 give predictions up to and including the second six-month period of 2015. The numbers given in these tables are the median, the 95th percentile and the 99th percentile of the number of deaths.
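Percentiles of a fitted count distribution can be obtained by searching for the smallest count whose cumulative probability reaches the target level. The sketch below assumes that the discrete Birnbaum Saunders distribution is the discretization \(p(x) = F(x + 1) - F(x)\) of the continuous Birnbaum Saunders cumulative distribution function \(F(t) = \varPhi \left( \left( \sqrt{t/\sigma } - \sqrt{\sigma /t}\right) /\xi \right) ,\) so that \(\Pr (X \le x) = F(x + 1).\) This form is an assumption of the example, and the parameter values are hypothetical rather than fitted values from the tables.

```python
import math
from statistics import NormalDist

def bs_cdf(t, sigma, xi):
    """Continuous Birnbaum Saunders cdf with scale sigma and shape xi, for t > 0."""
    return NormalDist().cdf((math.sqrt(t / sigma) - math.sqrt(sigma / t)) / xi)

def discrete_bs_quantile(p, sigma, xi):
    """Smallest x in {0, 1, ...} with P(X <= x) = F(x + 1) >= p, for 0 < p < 1."""
    x = 0
    while bs_cdf(x + 1, sigma, xi) < p:
        x += 1
    return x

# Hypothetical parameter values, chosen only for illustration.
sigma, xi = 10.5, 1.0
median = discrete_bs_quantile(0.50, sigma, xi)
q95 = discrete_bs_quantile(0.95, sigma, xi)
q99 = discrete_bs_quantile(0.99, sigma, xi)
print(median, q95, q99)
```

For a prediction at a future period, \(\sigma \) and \(\xi \) would first be evaluated at that period from the chosen time model and then passed to the quantile function.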
The number of deaths perpetrated by US-led coalition with Iraqi forces is generally higher than that perpetrated by US-led coalition only. The number of deaths predicted for Salah al-Din appears unusually high.
5 Conclusions
We have provided a statistical analysis of the number of civilian deaths from the Iraq conflict. Some of the main conclusions are: (i) the discrete Birnbaum Saunders distribution gives the best fit to the number of deaths among the 20 distributions considered, (ii) the distribution of the number of deaths differs significantly among the 18 provinces, (iii) the distribution of the number of deaths differs significantly among the four causes, (iv) the number of deaths in Baghdad, Anbar, Babylon, Basrah, Diyala, Ninewa, Tameem and Thi-Qar shows a decreasing trend, (v) the number of deaths in Dahuk and Salah al-Din shows an increasing trend, (vi) the number of deaths in Erbil, Missan, Qadissiya and Sulaymaniyah does not appear to show significant changes, (vii) the number of deaths due to explosives, air attacks and gunfire shows a decreasing trend, and (viii) the number of deaths due to suicide attacks does not appear to show significant changes.
References
Akaike, H.: A new look at the statistical model identification. IEEE Trans. Autom. Control 19, 716–723 (1974)
Fisher, R.A., Corbet, A.S., Williams, C.B.: The relation between the number of species and the number of individuals in a random sample of an animal population. J. Anim. Ecol. 12, 42–58 (1943)
Gómez-Déniz, E., Vázquez-Polo, F.J., García-García, V.: A discrete version of the half-normal distribution and its generalization with applications. Stat. Pap. (2012). doi:10.1007/s00362-012-0494-6
Jazi, M.A., Lai, C.D., Alamatsaz, M.H.: A discrete inverse Weibull distribution and estimation of its parameters. Stat. Methodol. 7, 121–132 (2010)
Johnson, N.L., Kotz, S., Kemp, A.W.: Univariate Discrete Distributions, 2nd edn. Wiley, New York (1992)
Krishna, H., Pundir, P.S.: Discrete Burr and discrete Pareto distributions. Stat. Methodol. 6, 177–188 (2009)
Lewis, E., Mohler, G., Brantingham, P.J., Bertozzi, A.L.: Self-exciting point process models of civilian deaths in Iraq. Secur. J. 25, 244–264 (2012)
Nakagawa, T., Osaki, S.: The discrete Weibull distribution. IEEE Trans. Reliab. 24, 300–301 (1975)
Nason, G.P., Bailey, D.: Estimating the intensity of conflict in Iraq. J. R. Stat. Soc. A 171, 899–914 (2008)
Schwarz, G.E.: Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978)
Yang, Z.: Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J. Mol. Evol. 39, 306–314 (1994)
Yule, G.U.: A mathematical theory of evolution, based on the conclusions of Dr. J. C. Willis, F. R. S. Philos. Trans. R. Soc. B 213, 21–87 (1925)
Nadarajah, S. A statistical analysis of Iraq body counts. Qual Quant 49, 21–37 (2015). https://doi.org/10.1007/s11135-013-9971-9