Abstract
Although it is common in community psychology research to have data at both the community (or cluster) level and the individual level, the analysis of such clustered data often presents difficulties for researchers. Since the individuals within a cluster cannot be assumed to be independent, the use of many traditional statistical techniques that assume independence of observations is problematic. Further, there is often interest in assessing the degree of dependence in the data resulting from the clustering of individuals within communities. In this paper, a random-effects regression model is described for the analysis of clustered data. Unlike ordinary regression analysis of clustered data, random-effects regression models do not assume that each observation is independent; instead, they assume that data within clusters are dependent to some degree. The degree of this dependency is estimated along with the usual model parameters, so that the estimated effects are adjusted for the dependency resulting from the clustering of the data. Models are described for both continuous and dichotomous outcome variables, and available statistical software for these models is discussed. An analysis of a data set in which individuals are clustered within firms is used to illustrate features of random-effects regression analysis, relative to both individual-level analysis, which ignores the clustering of the data, and cluster-level analysis, which aggregates the individual data.
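The degree of within-cluster dependency mentioned above is commonly summarized as an intraclass correlation (ICC): the share of total variance attributable to clusters. The following is a minimal sketch, not the authors' method. It assumes balanced clusters and a continuous outcome, and it uses the simple one-way ANOVA (moment) estimator rather than the maximum likelihood estimation a random-effects regression program would perform. The function names, the simulated variances, and the cluster counts are all hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_clusters(n_clusters, cluster_size, var_between, var_within, seed=0):
    """Simulate y_ij = cluster_effect_i + error_ij for balanced clusters (hypothetical data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        # One shared random effect per cluster induces the within-cluster dependency.
        cluster_effect = rng.gauss(0.0, var_between ** 0.5)
        data.append(
            [cluster_effect + rng.gauss(0.0, var_within ** 0.5) for _ in range(cluster_size)]
        )
    return data


def anova_icc(data):
    """One-way ANOVA estimator of the intraclass correlation for balanced clusters."""
    k = len(data)      # number of clusters
    n = len(data[0])   # individuals per cluster (assumed equal across clusters)
    cluster_means = [statistics.fmean(cluster) for cluster in data]
    grand_mean = statistics.fmean(cluster_means)
    # Mean square between clusters and mean square within clusters.
    msb = n * sum((m - grand_mean) ** 2 for m in cluster_means) / (k - 1)
    msw = sum(
        (y - m) ** 2 for cluster, m in zip(data, cluster_means) for y in cluster
    ) / (k * (n - 1))
    return (msb - msw) / (msb + (n - 1) * msw)


# Assumed illustration: between-cluster variance 1, within-cluster variance 3,
# so the population ICC is 1 / (1 + 3) = 0.25.
data = simulate_clusters(n_clusters=200, cluster_size=20,
                         var_between=1.0, var_within=3.0, seed=1)
icc = anova_icc(data)

# Design effect: the factor by which the variance of a mean is inflated
# relative to an independent sample of the same total size.
design_effect = 1 + (20 - 1) * icc
print(f"estimated ICC = {icc:.3f}, design effect = {design_effect:.2f}")
```

Even a modest ICC gives a design effect well above 1 when clusters are large, which is why an individual-level analysis that ignores clustering tends to understate standard errors.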
Additional information
Preparation of this article was supported by National Heart, Lung, and Blood Institute Grant R18 HL42987-01A1, National Institute of Mental Health Grant MH44826-01A2, and University of Illinois at Chicago Prevention Research Center Developmental Project CDC Grant R48/CCR505025.
Cite this article
Hedeker, D., McMahon, S.D., Jason, L.A. et al. Analysis of clustered data in community psychology: With an example from a worksite smoking cessation project. Am J Commun Psychol 22, 595–615 (1994). https://doi.org/10.1007/BF02506895