At times, educational psychologists step into the role of research translators, in which we attempt to distill reasoned and reasonable recommendations for educational practice or policy from our scientific work (cf. Alexander, 2013; Mayer, 2003; Renkl, 2013). For example, educational psychologists have recently forwarded practical recommendations concerning testing and measurement issues (e.g., McNeish & Dumas, 2019); instructional techniques (e.g., Reeve & Cheon, 2021); and diversity, equity, and inclusion efforts (e.g., Juvonen et al., 2019), among other areas. However, some in the educational psychology literature (e.g., Robinson et al., 2007, 2013) have argued that we may be inadvertently steering practitioners wrong by overstepping the data available to us.

A strongly worded paper in this line came from Brady and colleagues (2023). These scholars contended that educational psychologists should infer recommendations for practice solely from data drawn from experimental intervention studies. Although we agree with Brady and colleagues’ most fundamental point, namely that practical recommendations should be forwarded in a responsible way based on the evidence at hand, we do not agree that intervention research is the only form of inquiry that provides the evidence needed for practical recommendations. Instead, following other areas of science such as research on tobacco smoking (Sasco et al., 2004) or the atmospheric greenhouse effect (Nissani, 1996), we argue that it is possible for researchers to make carefully considered recommendations based on multiple sources of observational and correlational data.

In the context of educational psychology, we suggest that structural equation modeling (SEM; Bentler, 1980; Hoyle, 2023; Jöreskog, 1978) is a methodological paradigm that can support justified recommendations for practice based on correlational data. We outline a five-phase process in which SEM, informed by recent methodological developments in the modeling of correlational data, can be used to carefully derive implications for practice and test their generalizability. This way, as the field moves forward, educational psychologists can continue to fulfill their roles as responsible research translators, while also taking full advantage of the psychological meaning in their correlational data.

Phase One: Adopt Meta-Theoretical Beliefs About the Mind 

A key meta-theoretical belief among psychologists who utilize SEM is that mental attributes can be indirectly measured by observing behavior (Bollen, 2002; Bollen & Hoyle, 2023; Borsboom, 2008). This meta-theoretical understanding can be traced back more than a century, at least to the work of Charles Spearman (e.g., 1907), who posited that variance in observed behavior was underlain by two sources: latent psychological attributes that cause the behavior, and random measurement error inherent to the process of doing psychological research. This deceptively simple meta-theoretical belief can still be observed in SEM today, where behavioral indicators are endogenous to (i.e., affected by) their hypothesized latent psychological causes and to their corresponding error terms (Hoyle, 2023; Mueller & Hancock, 2019). To put it simply, SEM researchers believe that the mind causes behavior, and therefore behavior can be used to make inferences about the mind.
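In the generic notation of the SEM literature (the symbols here are illustrative rather than tied to any particular study), Spearman’s decomposition corresponds to the classical measurement model, in which each of p observed indicators x_i is endogenous to a latent attribute ξ, weighted by a loading λ_i, plus a random error term δ_i:

```latex
x_i = \lambda_i \xi + \delta_i , \qquad i = 1, \dots, p
```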

In addition, SEM meta-theoretically posits that the variance and covariance among behavioral indicators can be mathematically reduced to a smaller number of vectors or dimensions that represent the latent psychological variables being modeled (Thurstone, 1940). All leftover variance that is not represented in this smaller number of psychological dimensions is assumed to be caused by measurement error. Notably, there are other meta-theoretical beliefs about psychological phenomena; for example, some latent constructs could be caused or formed by indicator variables instead of the other way around (Schuberth, 2021), or behaviors captured by indicators could affect each other directly rather than indicate a common latent variable (van der Maas et al., 2006). In order to adopt an SEM-based research agenda, the first step is to decide which of these meta-theoretical assumptions best represents the psychological phenomena of interest. Although we limit the present discussion to working with latent variables that causally explain observed behavior, constructs for which other kinds of meta-theories are more appropriate can also be integrated into SEM (Epskamp et al., 2017; Schuberth, 2021).
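Under the latent-variable meta-theory adopted here, this dimension reduction can be written compactly in terms of the covariance structure the model implies, with Λ collecting the loadings, Φ the covariances among the latent variables, and Θ_δ the (typically diagonal) covariance matrix of the error terms:

```latex
\Sigma = \Lambda \Phi \Lambda^{\top} + \Theta_{\delta}
```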

Phase Two: Configure a Theoretically Meaningful Model

Next, a model must be built that represents, as faithfully as possible, how the phenomenon of interest unfolds. In this theoretical phase, a research team will closely interrogate the existing literature to form hypotheses about the interrelations among the latent variables in their model. What patterns of prediction and causation (Kline, 2023), what moderation (Kelava & Brandt, 2023), and what mediation (Gonzalez et al., 2023) may be occurring? Which important covariates also exist in this space, helping to cause variance and covariance in the behavioral indicators, and therefore must be included in the model, and which other covariates might be misleading if included (Pearl, 2023)? If causal conclusions are a goal, longitudinal data may be particularly helpful, because they allow model structures to be integrated into the SEM that make biased results less likely (Lüdtke & Robitzsch, 2022; McNeish et al., 2022). All pathways in a hypothesized model should be supported by past citations or reasonably founded theoretical justifications, and the ways in which the model deviates from the existing literature should be made explicit during publication.
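As a concrete illustration, the minimal sketch below translates a hypothetical motivation-engagement-achievement theory into SEM syntax. It assumes the Python package semopy, and the variable names and data file are purely illustrative assumptions, not a prescription; a real specification would justify every path from the literature, as described above.

```python
# A minimal sketch, assuming the Python package semopy and purely hypothetical
# variable names; a real specification would justify every path theoretically.
import pandas as pd
import semopy

# Measurement part ("=~"): indicators are caused by their latent variables.
# Structural part ("~"): hypothesized directed paths, including a covariate.
MODEL_DESC = """
motivation =~ mot1 + mot2 + mot3
engagement =~ eng1 + eng2 + eng3
engagement ~ motivation + prior_achievement
achievement ~ engagement + motivation + prior_achievement
"""

data = pd.read_csv("student_survey.csv")  # hypothetical dataset
model = semopy.Model(MODEL_DESC)
model.fit(data)  # estimates loadings, structural paths, and error variances
```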

In this way, the model is configured to match the most rigorous extant understanding of the phenomena being studied. In cases where multiple model configurations are roughly equally supported by the existing literature, both can and should be tested to determine which fits the data better (Preacher & Yaremych, 2023). To put it another way, the configuration of the SEM is meant to be an as-direct-as-possible translation of psychological theory into a statistical form, which is an important demand for making any statistical model informative (Robinaugh et al., 2021). In cases where researchers are not able to represent existing theory fully and faithfully in their model, perhaps because they did not measure all relevant psychological variables and covariates, the ways in which the model misses these theoretically relevant components should be made explicit as limitations. The more relevant covariates that can be included in the model, the more alternative explanations for correlations in our data can be ruled out, and the more indirect evidence for causation can be gathered (Reiss, 2015).
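Continuing the hypothetical sketch above, comparing two roughly equally supported configurations might look as follows; the information criteria and their column names follow semopy’s conventions and may differ across versions.

```python
# Continuing the sketch: an alternative configuration that drops the direct
# motivation -> achievement path (full rather than partial mediation).
ALT_DESC = MODEL_DESC.replace(
    "achievement ~ engagement + motivation + prior_achievement",
    "achievement ~ engagement + prior_achievement",
)

for label, desc in [("partial mediation", MODEL_DESC),
                    ("full mediation", ALT_DESC)]:
    m = semopy.Model(desc)
    m.fit(data)
    # Lower AIC/BIC favors the configuration balancing fit and parsimony.
    print(label, semopy.calc_stats(m)[["AIC", "BIC"]])
```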

Phase Three: Interpret Model-Data Fit Indices and Model Coefficients

For SEM researchers, it is the model’s job to reproduce the patterns in the observed dataset: values for all model parameters are estimated such that they best imply the observed variances and covariances. Fit indices are then computed to examine how well the model, with these estimated parameters, manages to describe the variances and covariances of the observed variables (West et al., 2023). In this way, fit indices can be used to evaluate whether the theoretical considerations that defined the SEM lead to an appropriate description of the observed data.
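Formally, under maximum-likelihood estimation, the parameters are chosen to minimize the discrepancy between the observed covariance matrix S and the model-implied covariance matrix Σ(θ), with p observed variables; most global fit indices are functions of this discrepancy and of model complexity:

```latex
F_{\mathrm{ML}}(\theta) \;=\; \ln\lvert\Sigma(\theta)\rvert
  \;+\; \operatorname{tr}\!\bigl(S\,\Sigma(\theta)^{-1}\bigr)
  \;-\; \ln\lvert S\rvert \;-\; p
```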

Of note, which fit indices and cut-offs are most appropriate depends on the research context, for example on the nature and number of the variables in the model and the amount of residual variance in the observed variables (Hancock & Mueller, 2011; Heene et al., 2011). A vast literature is available to guide researchers in selecting among available fit indices and in determining which cut-offs on these indices might indicate mild or substantial deviations of the empirical data patterns from the patterns implied by the model (see, e.g., Greiff & Heene, 2017; McNeish & Wolf, 2021; West et al., 2023).
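In the running sketch, such global fit statistics could be obtained via semopy’s calc_stats utility; the exact column names follow semopy’s conventions and may differ across versions.

```python
# Continuing the sketch: global fit statistics for the fitted model.
stats = semopy.calc_stats(model)  # one-row DataFrame of fit statistics
print(stats[["chi2", "chi2 p-value", "CFI", "TLI", "RMSEA"]])
# Heuristic cut-offs (e.g., CFI/TLI near .95, RMSEA below .06) should be read
# against the research context rather than applied mechanically.
```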

If a model fits the data well, estimated coefficients in that model can hold rich psychological information about students. For instance, through the model loadings, SEMs can depict how closely behavioral indicators relate to their corresponding latent variables (Bollen & Hoyle, 2023). Over and above the actual configuration of the structural model, which describes how the latent variables are hypothesized to interrelate, the structural coefficients capture the direction and degree of those interrelations. If a researcher has adopted the meta-theoretical beliefs described above and has configured their model to as-closely-as-possible represent psychological theory, these structural coefficients represent the essential make-up of the human mind vis-à-vis the theory being tested.
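Continuing the sketch, these quantities can be read from the parameter table; std_est is a semopy option that, per its documentation, adds standardized estimates, which are easier to compare across indicators and paths.

```python
# Continuing the sketch: the full table of estimated parameters. Rows linking
# indicators to latent variables hold the loadings; rows among the latent and
# observed predictors hold the structural coefficients.
print(model.inspect(std_est=True))
```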

Phase Four: Replicate Across Contexts

In educational psychology, phenomena typically vary across contexts (Berliner, 2002; Hedges, 2013; Plucker & Makel, 2021). For this reason, even the most well-supported models remain context-dependent theories and not laws in perpetuity. Appropriate fit across contexts supports a model’s standing as a reliable and valid tool for prediction and for deriving implications for practice. In contrast, contexts in which the available data do not fit the model can be taken as potential boundary conditions on the validity of the existing model. The model would then need to be changed (e.g., Harackiewicz et al., 2002) or extended (e.g., Merk et al., 2018; Wolff et al., 2019) to accommodate the new data.

In cases where the theory (and the SEM that represents it) needs to be amended based on newly available data from an additional educational context, the hypothesized reasons why those amendments were needed should also be incorporated into the theory. This opens the door to testing the new, broader theory by modeling patterns of variance and covariance across contexts via approaches such as multi-level (Heck & Reid, 2023), multi-group (Widaman & Olivera-Aguilar, 2023), meta-analytic (Cheung, 2023), local (Hildebrandt et al., 2016), or moderated SEM (Molenaar, 2021). This process of broadening what is known about the phenomena being studied, whether by identifying contexts in which the model holds as it is or contexts where it needs to be amended, is the expansion of our scientific understanding in educational psychology. Examples of this process within our field include the Big-Fish-Little-Pond effect (Werts & Watley, 1969) and the g-factor of cognitive ability (Spearman, 1927), which have been replicated in many populations around the world but have failed to hold up in others (Guilford, 1964; Seaton et al., 2009; Warne & Burningham, 2019).
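As a minimal illustration of this phase, the same hypothetical model from the sketches above could be refitted on data from several contexts; the contexts and file names are assumptions, and fitting per context is only a simple precursor to the formal multi-group, multi-level, or moderated approaches cited above.

```python
# A minimal sketch of replication across contexts, reusing MODEL_DESC from
# above; contexts and file names are hypothetical.
CONTEXT_FILES = {"school_A": "school_a.csv",
                 "school_B": "school_b.csv",
                 "school_C": "school_c.csv"}

for context, path in CONTEXT_FILES.items():
    ctx_data = pd.read_csv(path)
    ctx_model = semopy.Model(MODEL_DESC)  # identical configuration everywhere
    ctx_model.fit(ctx_data)
    fit = semopy.calc_stats(ctx_model)
    # Markedly worse fit in a context flags a potential boundary condition
    # and a candidate site for amending or extending the theory.
    print(context, fit[["CFI", "RMSEA"]])
```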

Importantly, SEM, like any other methodological approach, is not a technique to be used in isolation. The findings from SEM-based studies should always be compared with evidence from other approaches to see whether, under such methodological pluralism, findings hold up and robust conclusions can be drawn (Oreskes, 2019). When studies that have undertaken approaches such as controlled trials, inter- and intraindividual perspectives (which can both be integrated in SEM; see Asparouhov et al., 2018), and qualitative/mixed-methods approaches all yield comparable results, we can be more confident that we have modeled robust phenomena rather than tendencies of specific approaches to yield specific patterns (Eid et al., 2023). Generalization across populations and across methods are both required to arrive at the most robust conclusions.

Phase Five: Support Practitioners in Reasoning About the Evidence

Whenever educational psychologists communicate with practitioners or policymakers, care and caution are warranted so as not to overstate the evidence. For instance, good recommendations should involve only the variables included in an SEM and should not make conjectures about unmeasured variables external to the model. In addition, the actual strength of the coefficients in the model should be carefully communicated so that practitioners can understand what the numbers mean in terms of educational practices and outcomes. The degree to which the model fits the data, whether the fit is excellent or approaches the borderline of quality standards, should also be communicated, along with the contexts across which the model has so far been replicated. In this way, practitioners receive information that allows them to reason about the strength of the evidence and to make their own decision about whether to believe a theory and adopt it in their practice.

It is also important to consider the educational background of the practitioners with whom we communicate most frequently. For example, school psychologists, educational specialists, and classroom teachers might have very different educational backgrounds and knowledge about the methods utilized in a study. Literature on communicating statistical information (e.g., Schmidt et al., in press) and examples from clearinghouses such as the What Works Clearinghouse (WWC, 2022) can help in designing the communication of results from SEM-based research so that practitioners can validly interpret the information and use it for evidence-informed decisions in their educational practice (Greisel et al., 2023).

Conclusion

Rather than completely avoiding the derivation of practical recommendations from observational or correlational data, we should undertake it, so as not to lose potentially useful and valid information (Grosz et al., 2020), and we should bring the utmost care to this endeavor. Intervention studies and correlational designs each have their own strengths and weaknesses when it comes to the validity of inferences. Ultimately, whether practical implications can be drawn is not a yes/no question. All kinds of studies should be seen as lying on a continuum from “only observational statements possible” to “valid evidence for deducing recommendations for practice”. The better observational studies are designed, and the more carefully their data are analyzed in light of theory, taking into account appropriate covariates and harnessing modern modeling approaches, the closer we move toward the latter end of this continuum.