Abstract
Scientists often use randomized controlled trials to compare a newly developed treatment with an existing one or with a placebo. Patients are randomly assigned to a treatment, and the treatment groups are then compared with respect to the outcome of interest. The cluster randomized trial (CRT) is a type of randomized controlled trial in which treatments are randomized at the group level rather than the individual level. The intracluster correlation (ICC) measures the degree of similarity between individuals within the same cluster. CRTs can be designed in several ways, and it is essential that researchers plan the study carefully, from sample size calculation to ICC estimation to analysis, in order to obtain valid and meaningful results. In this article we review the considerations essential to conducting a successful CRT using both frequentist and Bayesian approaches, and we discuss recent trends in CRT analysis, highlighting new methodology for both binary and continuous data.
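To make the role of the ICC in planning concrete, the following is a minimal sketch, not the authors' method. It assumes the widely used one-way ANOVA estimator of the ICC and the standard design effect 1 + (m - 1) * ICC for an average cluster size m. The function names anova_icc and design_effect are hypothetical and chosen only for illustration.

```python
from statistics import mean


def anova_icc(clusters):
    """One-way ANOVA estimate of the ICC.

    `clusters` is a list of lists; each inner list holds the outcomes
    (continuous, or 0/1 for binary) of the individuals in one cluster.
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)
    grand_mean = sum(sum(c) for c in clusters) / n_total

    # Between-cluster and within-cluster mean squares.
    msb = sum(n * (mean(c) - grand_mean) ** 2
              for n, c in zip(sizes, clusters)) / (k - 1)
    msw = sum((y - mean(c)) ** 2
              for c in clusters for y in c) / (n_total - k)

    # Adjusted average cluster size, which allows for unequal cluster sizes.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    return (msb - msw) / (msb + (n0 - 1) * msw)


def design_effect(mean_cluster_size, icc):
    """Variance inflation relative to individual randomization."""
    return 1 + (mean_cluster_size - 1) * icc
```

Under these assumptions, the sample size computed for an individually randomized trial would be multiplied by the design effect to preserve power in a CRT. With 20 individuals per cluster, for example, an ICC of 0.05 nearly doubles the required number of participants.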
Chakraborty, H., Lyons, G. Cluster Randomized Trials: Considerations for Design and Analysis. J Stat Theory Pract 9, 685–698 (2015). https://doi.org/10.1080/15598608.2014.992081
DOI: https://doi.org/10.1080/15598608.2014.992081