Understanding the Complexities of Experimental Analysis in the Context of Higher Education

  • Living reference work entry
Higher Education: Handbook of Theory and Research

Part of the book series: Higher Education: Handbook of Theory and Research (HATR, volume 36)


The most reliable method for identifying the causal effect of a treatment on an outcome is to conduct an experiment in which the treatment is randomly assigned to a portion of the sample. The causal effect of the treatment is the difference in outcomes between units exposed to the treatment and those that were not. The apparent simplicity of randomized control trials (RCTs) belies the true complexity in designing, conducting, and analyzing them. This chapter provides an introduction to experimental analysis in higher education research that unpacks the multiple complexities inherent in RCTs. It presents both the basic logic and mathematics of experimental analysis before explaining more complex design elements such as blocking, clustering, and power analysis. The chapter also discusses issues that can undermine experiments such as attrition, treatment fidelity, and contamination and suggests methods of mitigating their negative effects. Concrete examples from experimental analyses in the higher education literature are provided throughout the chapter.
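The basic logic described above can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the chapter's own analysis: the function name `simulate_rct`, the sample size, the outcome distribution, and the constant additive treatment effect are all hypothetical choices made for illustration. Under random assignment, the difference in mean outcomes between treated and control units should land close to the true effect.

```python
import random
import statistics


def simulate_rct(n=2000, true_effect=2.0, seed=0):
    """Simulate a simple RCT and return the difference-in-means estimate.

    All numbers here are illustrative assumptions, not values from the chapter.
    """
    rng = random.Random(seed)
    # Hypothetical untreated outcome for each unit (e.g., a test score).
    baseline = [rng.gauss(50.0, 10.0) for _ in range(n)]
    # Randomly assign exactly half of the sample to treatment.
    treated_ids = set(rng.sample(range(n), n // 2))
    # Assumed constant additive effect for every treated unit.
    treated = [baseline[i] + true_effect for i in range(n) if i in treated_ids]
    control = [baseline[i] for i in range(n) if i not in treated_ids]
    # The estimator: mean outcome of treated minus mean outcome of control.
    return statistics.mean(treated) - statistics.mean(control)


estimate = simulate_rct()
print(f"Estimated effect: {estimate:.2f} (assumed true effect: 2.00)")
```

Because assignment is random, the estimate differs from the assumed true effect only by sampling noise, which shrinks as the sample grows.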


  1. Somewhat confusingly, balance in an experimental context can also refer to the proportion of units assigned to treatment and control. A balanced experiment is one in which 50% of units are assigned treatment and 50% are assigned control. I will use the term in this chapter to mean that individual characteristics are, on average, equivalent across treatment arms, what is more specifically referred to as covariate balance.

  2. Some sources use the term stratification instead of blocking, but I think it is wise to use separate terms in an effort to distinguish between random sampling and random assignment to treatment.

  3. There is debate in the literature around how many clusters are too few. Duflo et al. (2008) suggest fewer than 50 is problematic, while Hayes and Moulton (2017) suggest there need to be at least 15 clusters per treatment arm.
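Covariate balance, in the sense used in note 1, is often summarized with a standardized mean difference. The sketch below is one common way to compute it, offered as an illustration rather than as the chapter's procedure: the helper `standardized_mean_difference`, the simulated GPA covariate, and the 50/50 split are hypothetical.

```python
import math
import random
import statistics


def standardized_mean_difference(treated, control):
    """Difference in covariate means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt(
        (statistics.variance(treated) + statistics.variance(control)) / 2
    )
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd


rng = random.Random(42)
# Hypothetical baseline covariate (e.g., high school GPA) for 1,000 students.
gpa = [rng.gauss(3.0, 0.5) for _ in range(1000)]
# Random assignment with a balanced (50/50) allocation: shuffle, then split.
rng.shuffle(gpa)
treated, control = gpa[:500], gpa[500:]

smd = standardized_mean_difference(treated, control)
print(f"Standardized mean difference in GPA: {smd:.3f}")
```

A value near zero indicates that the covariate is, on average, similar across treatment arms; which thresholds count as acceptable is a judgment that depends on the standards being applied.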


  • Angelucci, M., & Di Maro, V. (2015). Program evaluation and spillover effects. IZA Discussion Paper No. 9033.

  • Angrist, J. D., & Pischke, J. (2009). Mostly harmless econometrics: An empiricist’s companion. Princeton: Princeton University Press.

  • Angrist, J. D., Imbens, G. W., & Rubin, D. B. (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91(434), 444–455.

  • Aronow, P. M. (2012). A general method for detecting interference between units in randomized experiments. Sociological Methods & Research, 41(1), 3–16.

  • Athey, S., & Imbens, G. (2017). The econometrics of randomized experiments. In A. V. Banerjee & E. Duflo (Eds.), Handbook of field experiments (pp. 73–140). Amsterdam: North Holland.

  • Baird, S., Bohren, A., McIntosh, C., & Özler, B. (2014). Designing experiments to measure spillover effects. World Bank Policy Research Working Paper No. 6824.

  • Baker, R. B., Evans, B. J., Li, Q., & Cung, B. (2019). Does inducing students to schedule lecture watching in online classes improve their academic performance? An experimental analysis of a time management intervention. Research in Higher Education, 60, 521–552.

  • Barnard, J., Frangakis, C. E., Hill, J. L., & Rubin, D. B. (2003). Principal stratification approach to broken randomized experiments: A case study of school choice vouchers in New York City. Journal of the American Statistical Association, 98(462), 299–323.

  • Barnett, E. A., Bergman, P., Kopko, E., Reddy, V., Belfield, C. R., & Roy, S. (2018). Multiple measures placement using data analytics: An implementation and early impacts report. Center for the Analysis of Postsecondary Readiness.

  • Barrow, L., Richburg-Hayes, L., Rouse, C. E., & Brock, T. (2014). Paying for performance: The education impacts of a community college scholarship program for low-income adults. Journal of Labor Economics, 32(3), 563–599.

  • Bastedo, M. N., & Bowman, N. A. (2017). Improving admission of low-SES students at selective colleges: Results from an experimental simulation. Educational Researcher, 46(2), 67–77.

  • Bell, M. L., Kenward, M. G., Fairclough, D. L., & Horton, N. J. (2013). Differential dropout and bias in randomised controlled trials: When it matters and when it may not. BMJ, 346:e8668, 1–7.

  • Benjamin-Chung, J., Arnold, B. F., Berger, D., Luby, S. P., Miguel, E., Colford, J. M., Jr., & Hubbard, A. E. (2018). Spillover effects in epidemiology: Parameters, study designs and methodological considerations. International Journal of Epidemiology, 47(1), 332–347.

  • Bettinger, E. P., & Baker, R. B. (2014). The effects of student coaching: An evaluation of a randomized experiment in student advising. Educational Evaluation and Policy Analysis, 36(1), 3–19.

  • Bettinger, E. P., & Evans, B. J. (2019). College guidance for all: A randomized experiment in pre-college advising. Journal of Policy Analysis and Management, 38(3), 579–599.

  • Bloom, H. S. (1995). Minimum detectable effects: A simple way to report the statistical power of experimental designs. Evaluation Review, 19(5), 547–556.

  • Bloom, H. S. (2005). Randomizing groups to evaluate place-based programs. In H. S. Bloom (Ed.), Learning more from social experiments: Evolving analytic approaches (pp. 115–172). New York: Russell Sage Foundation.

  • Bloom, H. S. (2008). The core analytics of randomized experiments for social research. In P. Alasuutari, L. Bickman, & J. Brannen (Eds.), The SAGE handbook of social research methods (pp. 115–133). Los Angeles: SAGE.

  • Bloom, H. S., Richburg-Hayes, L., & Black, A. R. (2007). Using covariates to improve precision for studies that randomize schools to evaluate educational interventions. Educational Evaluation and Policy Analysis, 29(1), 30–59.

  • Bloom, H. S., Raudenbush, S. W., Weiss, M. J., & Porter, K. (2017). Using multisite experiments to study cross-site variation in treatment effects: A hybrid approach with fixed intercepts and a random treatment coefficient. Journal of Research on Educational Effectiveness, 10(4), 817–842.

  • Borm, G. F., Melis, R. J. F., Teerenstra, S., & Peer, P. G. (2005). Pseudo cluster randomization: A treatment allocation method to minimize contamination and selection bias. Statistics in Medicine, 24(23), 3535–3547.

  • Botelho, A., & Pinto, L. C. (2004). Students’ expectations of the economic returns to college education: Results of a controlled experiment. Economics of Education Review, 23, 645–653.

  • Bowen, W. G., Chingos, M. M., Lack, K. A., & Nygren, T. I. (2013). Interactive learning online at public universities: Evidence from a six-campus randomized trial. Journal of Policy Analysis and Management, 33(1), 94–111.

  • Castleman, B. L., & Page, L. C. (2015). Summer nudging: Can personalized text messages and peer mentor outreach increase college going among low-income high school graduates? Journal of Economic Behavior & Organization, 115, 144–160.

  • Castleman, B. L., Arnold, K., & Wartman, K. L. (2012). Stemming the tide of summer melt: An experimental study of the effects of post-high school summer intervention on low-income students’ college enrollment. Journal of Research on Educational Effectiveness, 5(1), 1–17.

  • Castleman, B. L., Page, L. C., & Schooley, K. (2014). The forgotten summer: Does the offer of college counseling after high school mitigate summer melt among college-intending, low-income high school graduates? Journal of Policy Analysis & Management, 33(2), 320–344.

  • Cheng, A., & Peterson, P. E. (2019). Experimental estimates of impacts of cost-earnings information on adult aspirations for children’s postsecondary education. Journal of Higher Education, 90(3), 486–511.

  • Ciolino, J. D., Martin, R. H., Zhao, W., Hill, M. D., Jauch, E. C., & Palesch, Y. Y. (2015). Measuring continuous baseline covariate imbalances in clinical trial data. Statistical Methods in Medical Research, 24(2), 255–272.

  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale: Lawrence Erlbaum Associates, Publishers.

  • Darolia, R., & Harper, C. (2018). Information use and attention deferment in college student loan decisions: Evidence from a debt letter experiment. Educational Evaluation and Policy Analysis, 40(1), 129–150.

  • DeclareDesign. (2018). The trouble with ‘controlling for blocks.’

  • DiNardo, J., McCrary, J., & Sanbonmatsu, L. (2006). Constructive proposals for dealing with attrition: An empirical example. Working Paper.

  • Dong, N., & Maynard, R. (2013). PowerUp!: A tool for calculating minimum detectable effect size and minimum required sample sizes for experimental and quasi-experimental design studies. Journal of Research on Educational Effectiveness, 6(1), 24–67.

  • Donner, A. (1998). Some aspects of the design and analysis of cluster randomization trials. Applied Statistics, 47(1), 95–113.

  • Duflo, E., Glennerster, R., & Kremer, M. (2008). Using randomization in development economics research: A toolkit. In T. Schultz & J. Strauss (Eds.), Handbook of development economics, volume 4 (pp. 3895–3962). Amsterdam: North Holland.

  • Dynarski, S., Libassi, C. J., Michelmore, K., & Owen, S. (2018). Closing the gap: The effect of a targeted tuition-free promise on college choices of high-achieving, low-income students. NBER Working Paper No. 25349.

  • Evans, B. J., & Boatman, A. (2019). Understanding how information affects loan aversion: A randomized control trial of providing federal loan information to high school seniors. Journal of Higher Education, 90(5), 800–832.

  • Evans, B. J., & Henry, G. T. (2020). Self-paced remediation and math placement: A randomized field experiment in a community college. AEFP Working Paper.

  • Evans, B. J., Boatman, A., & Soliz, A. (2019). Framing and labeling effects in preferences for borrowing for college: An experimental analysis. Research in Higher Education, 60(4), 438–457.

  • Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175–191.

  • Faul, F., Erdfelder, E., Buchner, A., & Lang, A.-G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149–1160.

  • Field, E. (2009). Educational debt burden and career choice: Evidence from a financial aid experiment at NYU law school. American Economic Journal: Applied Economics, 1(1), 1–21.

  • Firpo, S., Foguel, M. N., & Jales, H. (2020). Balancing tests in stratified randomized controlled trials: A cautionary note. Economics Letters, 186, 1–4.

  • Fisher, R. A. (1935). The design of experiments. Edinburgh: Oliver and Boyd.

  • Frangakis, C. E., & Rubin, D. B. (2002). Principal stratification in causal inference. Biometrics, 58(1), 21–29.

  • Freedman, D. A. (2008). On regression adjustments to experimental data. Advances in Applied Mathematics, 40, 180–193.

  • Furquim, F., Corral, D., & Hillman, N. (2019). A primer for interpreting and designing difference-in-differences studies in higher education research. In Higher education: Handbook of theory and research: Volume 35 (pp. 1–58). Cham: Springer.

  • Gehlbach, H., & Robinson, C. D. (2018). Mitigating illusory results through preregistration in education. Journal of Research on Educational Effectiveness, 11(2), 296–315.

  • Gennetian, L. A., Morris, P. A., Bos, J. M., & Bloom, H. S. (2005). Constructing instrumental variables from experimental data to explore how treatments produce effects. In H. S. Bloom (Ed.), Learning more from social experiments: Evolving analytic approaches (pp. 75–114). New York: Russell Sage Foundation.

  • Gerber, A. S., & Green, D. P. (2012). Field experiments: Design, analysis, and interpretation. New York: Norton.

  • Gibbons, C. E., Serrato, J. C. S., & Urbancic, M. B. (2018). Broken or fixed effects? Journal of Econometric Methods, 8(1), 1–12.

  • Goldrick-Rab, S., Kelchen, R., Harris, D. N., & Benson, J. (2016). Reducing income inequality in educational attainment: Experimental evidence on the impact of financial aid on college completion. American Journal of Sociology, 121(6), 1762–1817.

  • Gopalan, M., Rosinger, K., & Ahn, J. B. (2020). Use of quasi-experimental research designs in education research: Growth, promise, and challenges. Review of Research in Education, 44(1), 218–243.

  • Grissmer, D. W. (2016). A guide to incorporating multiple methods in randomized controlled trials to assess intervention effects (2nd ed.). Washington DC: American Psychological Association.

  • Hallberg, K., Wing, C., Wong, V., & Cook, T. D. (2013). Experimental design for causal inference: Clinical trials and regression discontinuity designs. In T. D. Little (Ed.), Oxford handbook of quantitative methods. Volume 1: Foundations (pp. 223–236). New York: Oxford University Press.

  • Hansen, B. B., & Bowers, J. (2008). Covariate balance in simple, stratified and clustered comparative studies. Statistical Science, 23(2), 219–236.

  • Hanson, A. (2017). Do college admissions counselors discriminate? Evidence from a correspondence-based field experiment. Economics of Education Review, 60, 86–96.

  • Haxton, C., Song, M., Zeiser, K., Berger, A., Turk-Bicakci, L., Garet, M. S., Knudson, J., & Hoshen, G. (2016). Longitudinal findings from the early college high school initiative impact study. Educational Evaluation and Policy Analysis, 38(2), 410–430.

  • Hayes, R. J., & Moulton, L. H. (2017). Cluster randomized trials (2nd ed.). Boca Raton: CRC Press.

  • Hedges, L. V., & Rhoads, C. (2010). Statistical power analysis in education research (NCSER 2010-3006). National Center for Special Education Research, Institute of Education Sciences, U. S. Department of Education.

  • Holland, P. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396), 945–960.

  • Huber, M. (2012). Identification of average treatment effects in social experiments under alternative forms of attrition. Journal of Educational and Behavioral Statistics, 37(3), 443–474.

  • Hudgens, M. G., & Halloran, E. (2008). Toward causal inference with interference. Journal of the American Statistical Association, 103(482), 832–842.

  • Ihme, T. A., Sonnenberg, K., Barbarino, M., Fisseler, B., & Stürmer, S. (2016). How university websites’ emphasis on age diversity influences prospective students’ perception of person-organization fit and student recruitment. Research in Higher Education, 57(8), 1010–1030.

  • Konstantopoulos, S. (2008). The power of the test for treatment effects in three-level block randomized designs. Journal of Research on Educational Effectiveness, 1(4), 265–288.

  • Konstantopoulos, S. (2009). Using power tables to compute statistical power in multilevel experimental designs. Practical Assessment, Research, and Evaluation, 14, 10.

  • Kraft, M. A. (2020). Interpreting effect sizes of education interventions. Educational Researcher, 49(4), 241–253.

  • Lin, W. (2013). Agnostic notes on regression adjustments to experimental data: Reexamining Freedman’s critique. The Annals of Applied Statistics, 7(1), 295–318.

  • Lipsey, M. W., Puzio, K., Yun, C., Hebert, M. A., Steinka-Fry, K., Cole, M. W., Roberts, M., Anthony, K. S., & Busick, M. D. (2012). Translating the statistical representation of the effects of education interventions into more readily interpretable forms (NCSER 2013-3000). National Center for Special Education Research, Institute of Education Sciences, U. S. Department of Education.

  • Logue, A. W., Watanabe-Rose, M., & Douglas, D. (2016). Should students assessed as needing remedial mathematics take college-level quantitative courses instead? A randomized controlled trial. Educational Evaluation and Policy Analysis, 38(3), 578–598.

  • Mattanah, J. F., Brooks, L. J., Ayers, J. F., Quimby, J. L., Brand, B. L., & McNary, S. W. (2010). A social support intervention to ease the college transition: Exploring main effects and moderators. Journal of College Student Development, 51(1), 93–108.

  • Millan, T. M., & Macours, K. (2017). Attrition in randomized control trials: Using tracking information to correct bias. IZA Discussion Paper Series No. 10711.

  • Moerbeek, M. (2005). Randomization of clusters versus randomization of persons within clusters: Which is preferable? The American Statistician, 59(1), 72–78.

  • Morgan, K. L., & Rubin, D. B. (2012). Rerandomization to improve covariate balance in experiments. The Annals of Statistics, 40(2), 1263–1282.

  • Morgan, S. L., & Winship, C. (2015). Counterfactuals and causal inference: Methods and principles for social research (2nd ed.). New York: Cambridge University Press.

  • Mowbray, C. T., Holter, M. C., Teague, G. B., & Bybee, D. (2003). Fidelity criteria: Development, measurement, and validation. American Journal of Evaluation, 24(3), 315–340.

  • Murnane, R. J., & Willett, J. B. (2011). Methods matter: Improving causal inference in educational and social science research. Oxford: Oxford University Press.

  • National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. (1979). The Belmont report: Ethical principles and guidelines for the protection of human subjects of research. Washington, DC: U. S. Department of Health, Education, and Welfare.

  • Nelson, M. C., Cordray, D. S., Hulleman, C. S., Darrow, C. L., & Sommer, E. C. (2012). A procedure for assessing intervention fidelity in experiments testing educational and behavioral interventions. The Journal of Behavioral Health Services & Research, 39(4), 374–396.

  • Nosek, B. A., Ebersole, C. R., DeHaven, A. C., & Mellor, D. T. (2018). The preregistration revolution. Proceedings of the National Academy of Sciences, 115(11), 2600–2606.

  • O’Donnell, C. (2008). Defining, conceptualizing, and measuring fidelity of implementation and its relationship to outcomes in K-12 curriculum intervention research. Review of Educational Research, 78(1), 33–84.

  • Oreopoulos, P., & Dunn, R. (2013). Information and college access: Evidence from a randomized field experiment. Scandinavian Journal of Economics, 115(1), 3–26.

  • Oreopoulos, P., & Ford, R. (2019). Keeping college options open: A field experiment to help all high school seniors through the college application process. Journal of Policy Analysis and Management, 38(2), 426–454.

  • Page, L. C., Feller, A., Grindal, T., Miratrix, L., & Somers, M. (2015). Principal stratification: A tool for understanding variation in program effects across endogenous subgroups. American Journal of Evaluation, 36(4), 514–531.

  • Paloyo, A. R., Rogan, S., & Siminski, P. (2016). The effect of supplemental instruction on academic performance: An encouragement design experiment. Economics of Education Review, 55, 57–69.

  • Pugatch, T., & Wilson, N. (2018). Nudging study habits: A field experiment on peer tutoring in higher education. Economics of Education Review, 62, 151–161.

  • Raudenbush, S. W., & Liu, X. (2000). Statistical power and optimal design for multisite randomized trials. Psychological Methods, 5(2), 199–213.

  • Raudenbush, S. W., Martinez, A., & Spybrook, J. (2007). Strategies for improving precision in group-randomized experiments. Educational Evaluation and Policy Analysis, 29(1), 5–29.

  • Rhoads, C. H. (2011). The implications of “contamination” for experimental design in education. Journal of Educational and Behavioral Statistics, 36(1), 76–104.

  • Rhoads, C. H. (2016). The implications of contamination for educational experiments with two levels of nesting. Journal of Research on Educational Effectiveness, 9(4), 531–555.

  • Rivas, M. J., Baker, R. B., & Evans, B. J. (2020). Do MOOCs make you more marketable? An experimental analysis of the value of MOOCs relative to traditional credentials and experience. AERA Open, 6(4), 1–16.

  • Rosenbaum, P. R. (2007). Interference between units in randomized experiments. Journal of the American Statistical Association, 102(477), 191–200.

  • Rosinger, K. O. (2019). Can simplifying financial aid offers impact college enrollment and borrowing? Experimental and quasi-experimental evidence. Education Finance and Policy, 14(4), 601–626.

  • Rubin, D. (1980). Randomization analysis of experimental data: The Fisher randomization test comment. Journal of the American Statistical Association, 75(371), 591–593.

  • Rubin, D. (1986). Statistics and causal inference: Comment: Which ifs have causal answers. Journal of the American Statistical Association, 81(396), 961–962.

  • Rubin, D. (2008). For objective causal inference: Design trumps analysis. The Annals of Applied Statistics, 2(3), 808–840.

  • Schnabel, D. B. L., Kelava, A., & van de Vijver, F. J. R. (2016). The effects of using collaborative assessment with students going abroad: Intercultural competence development, self-understanding, self-confidence, and stages of change. Journal of College Student Development, 57(1), 79–94.

  • Schochet, P. Z. (2008). Statistical power for random assignment evaluations of education programs. Journal of Educational and Behavioral Statistics, 33(1), 62–87.

  • Schochet, P. Z. (2013). Estimators for clustered education RCTs using the Neyman model for causal inference. Journal of Educational and Behavioral Statistics, 38(3), 219–238.

  • Schochet, P. Z., & Chiang, H. S. (2011). Estimation and identification of the complier average causal effect parameter in education RCTs. Journal of Educational and Behavioral Statistics, 36(3), 307–345.

  • Scrivener, S., Weiss, M. J., Ratledge, A., Rudd, T., Sommo, C., & Fresques, H. (2015). Doubling graduation rates: Three-year effects of CUNY’s Accelerated Study in Associate Programs (ASAP) for developmental education students. MDRC.

  • Scrivener, S., Gupta, H., Weiss, M. J., Cohen, B., Cormier, M. S., & Brathwaite, J. (2018). Becoming college-ready: Early findings from a CUNY Start evaluation. MDRC.

  • Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston/New York: Houghton Mifflin Company.

  • Sinclair, B., McConnell, M., & Green, D. P. (2012). Detecting spillover effects: Design and analysis of multilevel experiments. American Journal of Political Science, 56(4), 1055–1069.

  • Smith, S. W., Daunic, A. P., & Taylor, G. G. (2007). Treatment fidelity in applied educational research: Expanding the adoption and application of measures to ensure evidence-based practice. Education and Treatment of Children, 30(4), 121–134.

  • Spybrook, J., Bloom, H., Congdon, R., Hill, C., Martinez, A., & Raudenbush, S. (2011). Optimal Design Plus empirical evidence: Documentation for the “Optimal Design” software.

  • Spybrook, J., Hedges, L., & Borenstein, M. (2014). Understanding statistical power in cluster randomized trials: Challenges posed by differences in notation and terminology. Journal of Research on Educational Effectiveness, 7(4), 384–406.

  • Tchetgen Tchetgen, E. J., & VanderWeele, T. J. (2012). On causal inference in the presence of interference. Statistical Methods in Medical Research, 21(1), 55–75.

  • VanderWeele, T. J., & Hernan, M. A. (2013). Causal inference under multiple versions of treatment. Journal of Causal Inference, 1(1), 1–20.

  • VanderWeele, T. J., Hong, G., Jones, S. M., & Brown, J. (2013). Mediation and spillover effects in group-randomized trials: A case study of the 4Rs educational intervention. Journal of the American Statistical Association, 108(502), 469–482.

  • Vazquez-Bare, G. (2019). Identification and estimation of spillover effects in randomized experiments. Working Paper.

  • Weiss, M. J., Mayer, A. K., Cullinan, D., Ratledge, A., Sommo, C., & Diamond, J. (2015). A random assignment evaluation of learning communities at Kingsborough Community College – Seven years later. Journal of Research on Educational Effectiveness, 8(2), 189–271.

  • What Works Clearinghouse. (2020). Standards handbook, version 4.1. Washington, DC: Institute of Education Sciences.

  • Whitehurst, G. J. (2003, April 21–25). The Institute of Education Sciences: New wine, new bottles [Conference presentation]. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL, United States.


I would like to thank Amanda Addison and Robin Yeh for providing research assistance in tracking down many of the references across fields cited in this chapter. The chapter also benefited from many helpful suggestions from an anonymous reviewer and the guidance of the Associate Editor, Nick Hillman.

Author information

Corresponding author

Correspondence to Brent Joseph Evans.

Copyright information

© 2021 Springer Nature Switzerland AG

About this entry

Cite this entry

Evans, B.J. (2021). Understanding the Complexities of Experimental Analysis in the Context of Higher Education. In: Perna, L.W. (eds) Higher Education: Handbook of Theory and Research. Higher Education: Handbook of Theory and Research, vol 36. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-43030-6

  • Online ISBN: 978-3-030-43030-6

  • eBook Packages: Springer Reference Education; Reference Module Humanities and Social Sciences; Reference Module Education
