
1 Introduction

Throughout this book, the integration of learning and working takes a central role. Research into the learning potential of the workplace and the characteristics that promote or impede learning has already offered interesting insights (e.g. Hurtz & Williams, 2009; Kyndt & Baert, 2013; Lohman, 2000). Billett (2001) states that the workplace has to be designed in a way that invites and stimulates people to learn. In addition, Tynjälä (2008) rightly pointed out that learning results from the interaction between the workplace and the individual: ‘While the organisation of work sets the context and conditions for learning, it continues to be the reciprocal interaction between the individual and the workplace that determines learning’ (Tynjälä, 2008, p. 141). When investigating professional learning, it is therefore important to pay attention to possible interactions and reciprocal and mediating relationships. Researching professional learning within the paradigm of the integration of work and learning is interesting because it captures the complexity of workplace learning. However, it does require more advanced statistical techniques that are able to model this complexity. Structural equation modelling (SEM) is one of the techniques that enable the examination of such complex relations. This book chapter aims to provide a basic introduction to SEM without using mathematical formulas or going into all the specific technicalities, while at the same time staying true to the complexity of the presented analysis.

The current book chapter starts by presenting SEM at a conceptual level. We will discuss why and when SEM could be used, what its advantages and disadvantages are in comparison with regression analysis and which different types of models can be analysed with SEM. Subsequently, the analyses will be illustrated with a dataset that was collected to investigate the approaches to learning of employees in relation to their work motivation, perceived workload and choice independence. Furthermore, several applications of SEM within research on professional learning will be discussed. Finally, the main conclusions will be summarised.

2 Structural Equation Modelling (SEM)

Structural equation modelling denotes a family of multivariate techniques including and combining factor analysis and path analysis in which the focus lies on theoretical constructs represented by latent factors (Hox & Bechger, 1998). Latent factors are unobserved constructs that are reflected by a set of observed variables. Within this section we will first focus on why and when SEM could be used. The different models within SEM will be introduced, and the assumptions of the analysis along with the conditions for conducting SEM will be discussed. Next, the fit indices used to evaluate the models are presented, and the use of modification indices for model improvement is considered.

The core of a SEM analysis involves specifying a theoretical model and subsequently testing whether this model is plausible given the sample data. This test compares the variance-covariance matrix implied by the theoretical model to the variance-covariance matrix observed in the sample data (Crockett, 2012). SEM is therefore sometimes also known as covariance structure analysis. SEM thus examines a model that represents the linear relationships among variables. Because SEM is based on the analysis of covariances, a SEM model in itself cannot establish causal effects (see Sect. 14.5).

2.1 Why and When Should We Use SEM?

The main reason for applying SEM instead of traditional regression analysis is its flexibility and ability to model more complex relationships between constructs. With SEM it is possible to specify ‘path models with intervening variables between the independent and dependent variables, and latent factor as well’ (Hox & Bechger, 1998, p. 6). Although the method of Baron and Kenny (1986) offers an alternative for assessing whether or not a variable mediates the relationship between two other variables, from a statistical point of view, analysing the different paths simultaneously will yield better results (Iacobucci, 2009). The estimates of the strength of the relationships are more precise, and there is less bias, as each effect is estimated together with the other effects (Iacobucci, 2009). In other words, the same variance cannot be estimated twice, as is the case when separate regression analyses are applied.

Chin (1998) states that for any given SEM model, alternative models can be found that fit the data as well as the proposed model yet potentially provide substantially different explanations of the data. Therefore, it is important to note that SEM is traditionally not recommended for exploratory purposes. Clear hypotheses about the structure of the data, both in terms of factors and in terms of paths between constructs, are needed for sound and replicable applications (Hox & Bechger, 1998). The paths included in the model should be theoretically justified (Chin, 1998). Additionally, the more complex the specified model is, the higher the requirements in terms of sample size become. This issue will be discussed within the section focusing on the conditions that need to be fulfilled for the analysis.

SEM models comprise a measurement model and a structural model. The measurement model relates the observed or manifest variables to the latent constructs, while the structural or path model denotes the paths or relationships between the constructs. Fitting a measurement model is also known as confirmatory factor analysis (Hox & Bechger, 1998; Iacobucci, 2009). Full SEM models combine the two into one model. When performing SEM analyses, it is always convenient to start by drawing a path diagram, as it can guide the analysis. Within this chapter, the generally accepted notation for representing SEM models will be used (Tacq, 1997). Within this notation, boxes represent the observed variables (e.g. the items in your questionnaire) and circles depict the latent constructs (e.g. the underlying construct you are trying to measure using different items). Within the measurement model, the arrows originate from the latent construct and point to the observed variables. The underlying idea is that the latent construct gives rise to, or is reflected in, the observed variables (Chin, 1998; Hox & Bechger, 1998; Iacobucci, 2009). Within the path model, single-headed arrows reflect the directional relationship between two constructs; double-headed arrows depict covariances.

Figure 14.1 represents a measurement model with three latent constructs and eight observed variables. The arrows on the left side of the observed variables indicate the residual error term originating from the fact that the observed variation is not completely explained by the latent construct.

Fig. 14.1 Conceptual model for a confirmatory factor analysis

Figure 14.2 depicts a conceptual model for a path analysis. The illustration shows a simple mediation model in which one variable mediates the relation between two other variables. Within a full SEM model (see below), the paths would connect latent constructs. A path analysis reflects the directional relations between observed variables (Cohen, Manion, & Morrison, 2011).

Fig. 14.2 Conceptual model for a path analysis with single mediation between two observed variables

Figure 14.3 shows a full SEM model in which the measurement and the structural model are combined into one analysis.

Fig. 14.3 Conceptual model for a full SEM model
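To make this notation concrete, the sketch below shows how models such as those in Figs. 14.1, 14.2 and 14.3 could be written in the lavaan syntax used later in this chapter. All construct and item names are generic placeholders: "=~" defines a latent construct reflected in observed variables, "~" a directional path and "~~" a covariance.

```r
# Generic lavaan model syntax; F1-F3 and x1-x8 are placeholder names.
example.model <- '
  # measurement part (cf. Fig. 14.1): latent constructs reflected in items
  F1 =~ x1 + x2 + x3
  F2 =~ x4 + x5 + x6
  F3 =~ x7 + x8
  # structural part (cf. Figs. 14.2 and 14.3): F2 mediates F1 -> F3
  F2 ~ F1
  F3 ~ F1 + F2
'
```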

2.2 Which Conditions Should Be Fulfilled?

When applying SEM and making inferences based on the analysis, it is important to consider the underlying assumptions of the analysis. First, SEM imposes the same statistical assumptions on the data as traditional regression analysis; in addition, multivariate normality is assumed. Second, SEM assumes that the correct model has been specified, meaning that no relevant variables are missing and that the directional relationships are specified correctly. A correct model specification foremost depends on the theoretical grounds of the research at hand. Moreover, the observed variables in SEM are assumed to reflect the latent construct and not to cause it (Kline, 2012). This also explains why the arrows in Figs. 14.1 and 14.3 point from the latent construct towards the observed variables and not vice versa. Changes in the latent construct should be reflected in all observed variables, as they are conceptually related to each other. The observed variables in the SEM model cannot be indicators that compensate for each other to form an artificial index or composite score (Kline, 2012). In the latter case of formative indicators, a change in one observed variable could cause a change in the construct but does not necessarily result in a change in the other observed variables. An example of a formative measure is when the success of an organisation is measured through the combination of its annual profits, the increase in the number of staff members and an indication of the popularity of the company. It could be that the company grows more popular while at the same time the number of staff members decreases. The profits, number of staff members and popularity of the company might be good indicators, but they do not reflect a latent construct, as they are conceptually unrelated.

After the model has been specified, it is important to check whether the model can be identified, that is, ‘whether a unique solution to the model can be generated’ (Crockett, 2012, p. 36). Model identification can be checked by following two guidelines developed by Bollen (1989 in Crockett, 2012). First, the structural model should be recursive; this means that all relationships within the structural model are unidirectional and no feedback loops are included. In other words, the dependent variables in the model cannot be a cause and an effect at the same time (Crockett, 2012; Kline, 2012). In addition, the observed variance-covariance matrix must contain more unique elements than the number of parameters that need to be estimated (i.e. factor loadings, latent constructs, paths between latent constructs, etc.). The number of unique elements in the variance-covariance matrix can be calculated as p(p + 1)/2, where p equals the number of observed variables. The latter guideline is also known as the t rule (Crockett, 2012).
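As a worked check of the t rule, consider the two-factor, ten-item measurement model analysed later in this chapter; the parameter count below assumes the default scaling convention in which the first loading of each factor is fixed to 1:

```r
# t rule check for a two-factor CFA with 10 observed variables.
p <- 10
unique.elements <- p * (p + 1) / 2   # 55 unique (co)variances
# free parameters: 8 loadings + 10 residual variances
# + 2 factor variances + 1 factor covariance = 21
n.parameters <- 21
unique.elements - n.parameters       # df = 34, as in Output 1 below
```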

The invalidating effect of violations of SEM’s statistical assumptions can be reduced by meeting conditions regarding sample size and missing values.

Sample size. The debate about appropriate sample sizes for SEM is ongoing. For every ‘rule of thumb’ that exists, another emerges. In general, the rule is that the more complex the model and the more parameters that need to be estimated, the larger the sample size needs to be and, of course, larger is better (Iacobucci, 2010). The most correct and accurate method to assess the sample size is to assess the power of the analysis, as the required sample size depends on the specifications of the SEM model at hand (Chin, 1998). More information on power analysis can be found in MacCallum, Browne and Sugawara (1996). Iacobucci (2010) argues that the vague rule of thumb that the sample needs to be larger than 200, which was commonly accepted a while ago, is conservative and overly simplistic. In her article, she therefore argues that small samples of 50–100 could suffice. However, we would like to emphasise that this is only the case when you are testing a simple model with strong effects. One could wonder whether such simple models depict the true effects in an accurate way. The literature does offer some interesting rules of thumb that give an indication of an appropriate sample size, in which the number of constructs or estimated parameters is taken into account. For the measurement model, the ratio of the sample size to the number of observed variables should be at least 10:1 (Hair, Black, Babin, Anderson, & Tatham, 2006). Bentler and Chou (1987) recommend that the ratio between the sample size and the number of parameters that need to be estimated should also be 10:1 or higher. More information on conducting SEM with small samples can be found in the article of Bentler and Yuan (1999). In addition, alternative estimation methods such as partial least squares (PLS) exist that are appropriate for small samples (see Sect. 14.4).

Missing values. By default, SEM only uses the data of participants without missing values. This approach assumes that if the dataset contains missing values, these values are missing completely at random. If this is not the case, one could adopt more advanced methods for handling missing values. For more information on this topic, the reader is referred to Allison (2003).
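In lavaan, for example, full-information maximum likelihood can be requested instead of the default listwise deletion; a minimal sketch, where cfa.model and dat are placeholder names:

```r
# FIML estimation, assuming values are missing at random.
fit <- cfa(cfa.model, data = dat, missing = "ml")
```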

There is some evidence that SEM is robust to violations of the statistical assumptions if the sample size is large (more than 200 independent observations) and there are no missing values (Hsu, Chen, & Hsieh, 2006; Hu, Bentler, & Kano, 1992; Yuan & Bentler, 1999; Yuan & Zhong, 2013). However, with highly discrete and/or skewed data, especially if sample size is small or moderate, it is recommended to apply more robust estimation techniques and alternative statistics (for more information on these alternatives, see Bentler & Yuan, 1999; Jung, 2013; Kline, 2012; Satorra, 1990).
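In lavaan, one such robust alternative can be requested through the estimator argument; again a minimal sketch with placeholder names:

```r
# Satorra-Bentler scaled test statistic and robust standard errors.
fit <- cfa(cfa.model, data = dat, estimator = "MLM")
```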

2.3 Fit Indices

A wide variety of fit indices have been proposed to evaluate a model in terms of both goodness of fit and simplicity. Some fit indices emphasise the fit of the model to the data, while others take into account whether the model is parsimonious (Hox & Bechger, 1998; Iacobucci, 2010). In general, there is some agreement on which fit indices should be reported. First, there is the chi-square test, which is the only inferential measure. The null hypothesis of the chi-square test is that the model fits the data, meaning that, to conclude that your model fits the data, the chi-square test should not be significant. However, the chi-square test is very sensitive to sample size (Hox & Bechger, 1998). As a consequence, when working with large samples, the statistical test will be significant in almost all real data applications (Hox & Bechger, 1998; Iacobucci, 2010). One might think that it would be advisable not to work with large samples; however, this is not valid advice, as a sufficiently large sample size is necessary to support the precision of the parameter estimation (Iacobucci, 2010). Alternatively, it has been suggested that fit is acceptable when the ratio of the chi-square test statistic to the degrees of freedom is not larger than 3: χ²/df ≤ 3 (Iacobucci, 2010).

Due to the sensitivity of the chi-square test, the fit of the model is always evaluated based on multiple alternative indices. Because all goodness-of-fit indices are some function of the chi-square statistic, the majority of these indices are also sensitive to sample size, but to a much smaller degree than the chi-square test (Hox & Bechger, 1998). As the following indices are not inferential and no statistical hypothesis testing is involved, only guidelines or ‘rules of thumb’ can be offered (Hu & Bentler, 1999; Iacobucci, 2010). Table 14.1 summarises the different guidelines that are offered in the standard methodological literature. Below we describe the most commonly used cut-off values. In general, authors are advised to report the comparative fit index (CFI), which captures the relative goodness of fit in comparison to a simpler model. In a sense it indicates whether making your model more complex actually pays off. Preferably, the CFI is close to or higher than .95 (Hu & Bentler, 1999; Iacobucci, 2010), but values starting from .90 are considered acceptable (Iacobucci, 2010). Some authors also report the Tucker-Lewis Index (TLI), also known as the Non-normed Fit Index (NNFI), which follows the same rules of thumb as the CFI. Next to the chi-square test and CFI, the Standardised Root Mean Square Residual (SRMR) is usually reported. The SRMR indicates to what extent your model does not fit the data: the higher the value, the worse the model fit. The SRMR largely depends on the factor loadings in the measurement model and is less prone to violations of the distributional assumptions. The maximum value of this index equals 1, and (very) low values are preferred; values below .08 indicate an acceptable model. Finally, most researchers also report the Root Mean Square Error of Approximation (RMSEA) and its 90% confidence interval. The RMSEA was developed to provide an indication of the extent to which the model does not match the true model; small values indicate a good match. An RMSEA smaller than .06 is advised, although a value lower than .08 can also be considered acceptable (Browne & Cudeck, 1993; MacCallum et al., 1996).

Table 14.1 Cut-off values for fit indices
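The indices discussed above can be extracted from any fitted lavaan model with the fitMeasures function; a minimal sketch, where fit denotes a fitted cfa or sem object:

```r
# Request the commonly reported fit indices from a fitted model.
fitMeasures(fit, c("chisq", "df", "pvalue", "cfi", "tli", "srmr",
                   "rmsea", "rmsea.ci.lower", "rmsea.ci.upper"))
```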

2.4 Modification Indices

When the model fit is not satisfactory, a researcher could choose to modify the model. This modification can be based on the parameter estimates (e.g. removing nonsignificant paths or observed variables with low loadings) or on the modification indices, which indicate which parameter(s) could be added to the model to improve the model fit and by how much the chi-square statistic is expected to minimally decrease when that parameter is added (Hox & Bechger, 1998; Iacobucci, 2009). There is some debate on the use or misuse of modification indices (Hox & Bechger, 1998; Iacobucci, 2009). At first glance, these modification indices appear to be very helpful, and they are often used to improve the model fit sequentially, up to the point that the model meets the requirements. A strong advice regarding the use of modification indices is that alterations to the hypothesised model based on the data should only be made when there are theoretical grounds that support these alterations (Chin, 1998; Hox & Bechger, 1998). However, some researchers seem to be very creative in evaluating the theoretical justification post hoc, leading to potentially incorrect models or models that are difficult to cross-validate because they rely too much on the data of the sample at hand (Hox & Bechger, 1998; Iacobucci, 2009). It is therefore proposed to use modification indices prudently and to compare different a priori constructed models (Hox & Bechger, 1998).
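In lavaan, the modification indices can be listed as sketched below; the mi column contains the expected minimal decrease in the chi-square statistic when the corresponding parameter is freed (fit is again a placeholder for a fitted model object):

```r
# Inspect the ten largest modification indices of a fitted model.
mi <- modindices(fit)
mi[order(mi$mi, decreasing = TRUE), ][1:10, ]
```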

3 Analysing and Interpreting Data

Within this section, the goal is to offer some guidelines for making decisions about the data and the steps that could be followed within the analysis, as well as for the interpretation of the results. However, bear in mind that how the model is built and the decisions taken within this process need to be theory driven. Crockett (2012) describes five sequential steps within SEM. The first two steps were discussed above: model specification and model identification. Subsequently, the model is estimated. Different estimation procedures can be used for the estimation of the variance-covariance matrix of the model; within this chapter, maximum likelihood estimation will be used. For an introduction to the different estimation methods, we refer the reader to Crockett (2012). Below, we will foremost focus on the fourth and fifth steps: model testing and model modification. Model testing involves the evaluation of the plausibility of the theoretical model given the sample data (Crockett, 2012). This evaluation is based on multiple fit indices (cf. supra). The final step of model modification was already introduced briefly above and will be illustrated below.

3.1 Illustration: Concept and Sample

The analyses will be illustrated with a data set that was collected to investigate the relationship between employees’ approaches to learning at work and their work motivation, perceived workload and choice independence.

3.1.1 Concepts and Measurement Instruments

Within the literature three approaches to learning at work are distinguished: a deep approach, a surface-rational approach and a surface-disorganised approach. A deep approach to learning refers to the combination of an eagerness to learn and the use of integrative strategies that contribute to personal understanding. The surface-rational approach reflects a preference for orderly, accurate and detailed work achieved by using surface learning strategies such as memorisation and a step-by-step approach. Finally, the surface-disorganised approach is considered a nonacademic orientation in combination with surface motives; it is associated with feeling overwhelmed and a sense of incompetence when executing tasks. These approaches to learning at work were measured by means of the Approaches to Learning at Work Questionnaire (Kirby, Knapper, Evans, Carty, & Gadula, 2003).

Work motivation was conceptualised from the perspective of self-determination theory. Within this study the focus lay on the reasons why someone does a particular job, and a distinction was made between autonomous and controlled motivation. Work motivation was measured with the Motivation at Work Scale (Gagné et al., 2010). Perceived workload and choice independence were measured with the Workplace Climate Questionnaire (Kirby et al., 2003). The complete theoretical background, the rationale and the results of the actual study can be found in the article of Kyndt, Raes, Dochy and Janssens (2013).

3.1.2 Sample

The sample consisted of 358 employees from diverse companies (59 % female). The majority of the participants were employed in profit organisations (52 %), and 38 % were employed in nonprofit or social profit organisations (e.g. healthcare). The remaining 10 % of the participants were employed within the public sector. Participants were between 20 and 64 years old (M = 37.85, SD = 10.64); on average they had 11.22 years of seniority (SD = 10.16). Most respondents had a permanent full-time contract (83 %); others worked part time (14 %) or had a temporary contract (3 %). Finally, participants’ initial level of education was diverse: ‘1 % did not obtain a diploma or finished elementary school, 25 % obtained a secondary degree, 40 % obtained a bachelor’s degree (professional or academic), and 34 % obtained a master’s degree’ (Kyndt et al., 2013, p. 278).

3.1.3 Software and Output

For this illustration the analyses were performed with the lavaan package of R (Rosseel, 2012). R is free software for statistical computing that can be downloaded from www.R-project.org (R Development Core Team, 2012). Figure 14.4 was plotted by means of the qgraph package (Epskamp, Cramer, Waldorp, Schmittmann, & Borsboom, 2012). The R code of this example can be found in the appendix. SEM analyses can also be performed with AMOS (an extension of SPSS; Arbuckle, 2011), the SAS CALIS procedure (SAS Institute Inc, 2008), Mplus (Muthén & Muthén, 1998–2010), EQS (Bentler, 2004) or LISREL (Jöreskog & Sörbom, 1997). For this illustration, we chose to present the output as given by R so that readers will recognise this output when undertaking the analysis themselves. This output presents more information than is discussed within this introductory chapter; therefore, we have marked the values on which the interpretations are based. When performing the analysis in R, the first steps are setting a working directory, loading the data and installing the necessary packages.
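A minimal sketch of these first steps is given below; the directory and file names are assumptions and should be replaced by your own:

```r
# One-time installation, then load the package and the data.
install.packages("lavaan")
library(lavaan)
setwd("C:/myproject")               # assumed working directory
dat <- read.csv("employees.csv")    # assumed data file
```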

Fig. 14.4 Measurement model with standardised coefficients

3.2 Measurement Model: Confirmatory Factor Analysis

We will illustrate the confirmatory factor analysis with the simplest measurement model included within the study, that is, the measurement of perceived workload and perceived choice independence of the employee (Kirby et al., 2003). Both constructs were measured by five items (i.e. observed variables) that were scored on a five-point Likert scale (see Table 14.2).

Table 14.2 Items measurement model

3.2.1 Confirming the Model
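A minimal sketch of how this measurement model could be specified and fitted in lavaan is given below; the item names WL1–WL5 and CI1–CI5 are assumptions and should match the variable names in your data (cf. Table 14.2):

```r
# Two-factor CFA: perceived workload and choice independence.
cfa.model <- '
  workload =~ WL1 + WL2 + WL3 + WL4 + WL5
  choice   =~ CI1 + CI2 + CI3 + CI4 + CI5
'
fit1 <- cfa(cfa.model, data = dat)
summary(fit1, fit.measures = TRUE, standardized = TRUE)
```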

The results of the CFA show that the hypothesised measurement model fits the data reasonably well (Output 1). Although the chi-square test is statistically significant (χ² = 98.223, df = 34, p < .001), the ratio between the test statistic and the degrees of freedom (98.223/34 = 2.89) is below 3. In addition, the CFI and TLI equal .94 and .92 respectively, which is above the proposed cut-off of .90. The SRMR and RMSEA are rather high (SRMR = .075, RMSEA = .073, 90% CI [.056, .090]), but both are acceptable, although it is not a good sign that values within the confidence interval exceed .08.

Output 1 CFA all items

In sum, the measurement model shows an acceptable fit, but there is room for improvement. Because the SRMR is rather high, it could be interesting to examine the factor loadings of the different items and delete item(s) with low factor loadings. Ideally the standardised values of the factor loadings (see std.all in output) are around or above .50 (Hair et al., 2006; Maruyama, 1998). Within this solution, only item WL3 has a factor loading below .50. If we look at the questions in Table 14.2, we can observe that WL3 focuses on a different aspect in comparison with the other four items. WL3 is the only item that considers learning. Therefore, we decided to test a second measurement model in which item WL3 was excluded. The results show an improved fit (Table 14.3 and Output 2).

Table 14.3 Model fit without item WL3

Output 2 CFA without WL3

Because both models show an acceptable fit, both could be accepted. The decision for the most appropriate measurement model should foremost be guided by theory. In this case, one could consider whether or not investing time in learning is an important aspect of the construct of perceived workload. In addition, you could also test which model is superior in a statistical sense. Because the first model can be considered an extension of the second model (i.e. the models are nested), the chi-square test for model comparison can be applied. By using a simple formula in Excel (=CHIDIST(Δχ²; Δdf)), you can calculate whether the difference between the chi-square statistics of both models is statistically significant. In this example, the difference between the chi-square statistics equals 98.223 − 62.391 = 35.832, and the difference between the degrees of freedom equals 34 − 26 = 8. The significance test reveals that this difference is statistically significant (p < .001), indicating that the second model is statistically superior to the first (a smaller chi-square value indicates a better fit). From a theoretical point of view, this conclusion is also supported because the emphasis within perceived workload lies on general workplace conditions, regardless of whether the effort had to be undertaken for learning specifically. Figure 14.4 depicts the measurement model.
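The same test can be carried out directly in R with the chi-square values reported above; for two nested lavaan models, anova(fit1, fit2) gives the equivalent comparison:

```r
# Chi-square difference test, equivalent to the Excel CHIDIST formula.
pchisq(98.223 - 62.391, df = 34 - 26, lower.tail = FALSE)  # p < .001
```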

3.2.2 The Stability of the Model: Measurement Invariance

Recently, more and more attention has been given to the stability of the measurement model (Boeve-de Pauw, Jacobs, & Van Petegem, 2012; Coertjens, Donche, De Maeyer, Vanthournout, & Van Petegem, 2013). If the goal is to compare the constructs across groups and/or over time – which is often the case – it is important to determine whether a questionnaire measures the same constructs with the same structure across groups and/or over time. When measurement invariance is established, it can be accepted that different groups of participants (e.g. males and females) or the same participants across different measurement moments (longitudinal measurement invariance) interpret the individual questions and underlying constructs in a similar way.

Different levels of measurement invariance, from less to more demanding, are described: configural, metric, scalar and strict invariance. Configural invariance (Model 1) refers to the fact that the basic model structure is invariant across groups (Boeve-de Pauw et al., 2012): the same items belong to the same constructs for both groups or at both measurement moments, indicating that participants conceptualise the constructs similarly. Configural invariance, however, does not ensure that the separate items are interpreted similarly, because the factor loadings of the items can differ across groups or measurement moments.

To test whether metric invariance is achieved, one can compare the configural model to a model in which the factor loadings are constrained to be equal for each group or at each measurement moment (Model 2). If constraining the factor loadings does not result in a significantly worse fit of the model, metric invariance can be claimed. A model is considered as good as the previous model if the difference between the CFIs of both models is smaller than .01. Ideally, the difference in the chi-square test is also not significant; however, the chi-square statistic is known to be sensitive to sample size (Iacobucci, 2010).

Metric invariance indicates that the items are interpreted in a similar way across groups or measurement moments. When the aim is to compare means of the latent constructs across groups, it is necessary to achieve scalar invariance. Scalar invariance indicates that differences in the means of the observed items are a consequence of differences in the means of the latent constructs. To identify scalar invariance, the model in which both the loadings and the intercepts of the items are constrained to be equal across groups or measurement moments (Model 3) is compared to the model in which only the loadings are constrained (Model 2). When the model with constrained loadings and intercepts results in too large a decrease in CFI, one could explore the option of freeing some of the intercepts, achieving partial intercept invariance. Advanced statistical models, such as multiple-indicator growth analysis in the case of longitudinal analysis, can take these intercept variances into account (e.g. Coertjens et al., 2013). Finally, one can also check for invariance of the error variances (strict invariance). In practice, however, full measurement invariance is rarely achieved, nor is it necessary (Boeve-de Pauw et al., 2012).

Measurement invariance across groups can be tested relatively easily in R with the semTools package (Pornprasertmanit, Miller, Schoemann, & Rosseel, 2013). The specific commands can be found in the Appendix.
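A minimal sketch of such an invariance test, using the measurementInvariance function of semTools as available at the time of writing; the grouping variable name gender is an assumption:

```r
# Sequentially fits and compares the configural, metric, scalar and
# strict invariance models across groups.
library(semTools)
measurementInvariance(model = cfa.model, data = dat, group = "gender")
```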


Output 3 Measurement invariance

Within our illustration, we checked whether males and females interpreted the items and constructs in a similar way; in other words, the measurement invariance across males and females was tested. The results can be found in Output 3 and are preferably reported by means of a table (see Table 14.4). The results in Table 14.4 show that the measurement of perceived workload and choice independence reaches scalar invariance: the differences between the CFIs are below .01, and the chi-square tests for model comparison show that the models do not differ significantly from each other.

Table 14.4 Measurement invariance for males and females

Establishing longitudinal measurement invariance follows the same procedure as establishing measurement invariance over groups:

  • Testing configural invariance

  • Constraining factor loadings to be equal and comparing this to the configural invariance model (metric invariance)

  • Constraining factor loadings and intercepts to be equal and comparing this to the metric invariance model (scalar invariance)

However, the difference with establishing measurement invariance across groups is that longitudinal measurement invariance is assessed for each scale separately and not for the instrument as a whole (Coertjens, Donche, De Maeyer, Vanthournout, & Van Petegem, 2012). Currently, longitudinal measurement invariance cannot be assessed by means of an R package; this type of analysis is foremost executed with the Mplus software (Muthén & Muthén, 1998–2010). An example of longitudinal measurement invariance testing relevant to the field of professional learning can be found in our article on the development of a self-assessment instrument for the generic competences of vocational education students (Kyndt et al., accepted).

3.3 Structural Model: Path Analysis

An illustration of a model including the two motivational scales as mediating variables is provided below. Figure 14.5 represents the path diagram of the model under examination.

Fig. 14.5 Path diagram for the path analysis (Note: WL workload, CI choice independence, Auto. autonomous motivation, Contr. controlled motivation, DA deep approach)

The path analysis including the two mediating variables shows a fit that is not optimal. Moreover, because the model includes almost every possible relationship, it cannot be considered parsimonious. The poor model fit is evident from a ratio between the chi-square and degrees of freedom that is too large (χ²/df = 5.92) and an RMSEA of .12, which is also too large. Simplifying the model by excluding nonsignificant paths could be a solution to this problem. Based on the output (Output 4), we decided to exclude the path between controlled motivation and a deep approach to learning, and the path between workload and autonomous motivation. Because the interest lies in predicting employees’ deep approach to learning, and controlled motivation was inserted as a mediator between the perception of workload and choice independence, the paths from workload and choice independence to controlled motivation were also removed (Output 5). The new model that will be tested is represented by a path diagram in Fig. 14.6.

Fig. 14.6 Path diagram for the path analysis without nonsignificant paths


Output 4 Path analysis with mediating variables

This more parsimonious model shows a good fit (χ²/df = 1.82, CFI = .99, TLI = .98, RMSEA = .048, 90% CI [.000, .058], SRMR = .02). All included paths are significant, indicating that perceived workload and choice independence significantly predict employees’ deep approach to learning. In addition, autonomous motivation was found to mediate this relationship.
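A minimal sketch of how this trimmed path model could be specified in lavaan; the variable names follow the note under Fig. 14.5, the exact set of retained paths should be checked against Fig. 14.6 and Output 5, and the labelled indirect effect is added for illustration:

```r
# Trimmed path model with autonomous motivation as mediator; the
# variables are observed scale scores assumed to be columns in dat.
path.model <- '
  DA   ~ b1 * Auto + WL + CI   # deep approach regressed on predictors
  Auto ~ a1 * CI               # mediator regressed on choice independence
  ind  := a1 * b1              # indirect effect of CI on DA via Auto
'
fit.path <- sem(path.model, data = dat)
summary(fit.path, fit.measures = TRUE, standardized = TRUE)
```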


Output 5 Path analysis mediating variables without nonsignificant paths

3.4 Full SEM Model

Finally, a full SEM model combines the measurement model and structural model that were presented above. As an illustration, the full SEM model of the final path analysis will be examined (see Fig. 14.7).

Fig. 14.7 Path diagram for the full SEM model

The results show that this model does not adequately fit the data (χ²/df = 2.47, CFI = .85, TLI = .84, RMSEA = .065, 90% CI [.059, .071], SRMR = .075). The modification indices were checked to examine whether the model could be improved (Output 6).
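A minimal sketch of how such a full SEM model could be specified is shown below; the number of items per construct and the item names (AM1–AM3, DA1–DA3, etc.) are assumptions for illustration:

```r
# Full SEM: measurement models plus structural paths in one analysis.
sem.model <- '
  workload =~ WL1 + WL2 + WL4 + WL5
  choice   =~ CI1 + CI2 + CI3 + CI4 + CI5
  auto     =~ AM1 + AM2 + AM3
  deep     =~ DA1 + DA2 + DA3
  deep ~ workload + choice + auto
  auto ~ choice
'
fit.sem <- sem(sem.model, data = dat)
```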


Output 6 Full SEM model

The modification indices (Output 7) showed that the model could be improved by adding covariances between several observed variables. Only combinations of observed variables reflecting the same latent construct were included, because these can be considered in accordance with our theoretical model. For presentation purposes, only an excerpt of the output is included.
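A minimal sketch of this modification step, continuing from the sketch above; the pair WL1 ~~ WL2 is a hypothetical example of two items reflecting the same construct:

```r
# Inspect the largest modification indices of the full SEM model.
mi <- modindices(fit.sem)
mi[order(mi$mi, decreasing = TRUE), ][1:10, ]

# Free one theoretically defensible residual covariance and refit.
sem.model2 <- paste(sem.model, 'WL1 ~~ WL2', sep = '\n')
fit.sem2 <- sem(sem.model2, data = dat)
fitMeasures(fit.sem2, c("chisq", "df", "cfi", "tli", "rmsea", "srmr"))
```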


Output 7 Modification indices full SEM model

These modifications (Output 8) result in an acceptable model fit (χ²/df = 1.94, CFI = .91, TLI = .90, RMSEA = .052, 90% CI [.045, .059], SRMR = .064). When reporting on a SEM analysis, it is not necessary to provide all coefficients of every model that was tested; it is, however, important that the reader gets an overview (with fit indices) of the different models that were tested. Only the coefficients of the final model should be reported. A table containing the coefficients, standardised coefficients, critical ratios and significance (levels) is usually included. Table 14.5 provides an example of how the results of the final full SEM model could be presented.

Table 14.5 Coefficients final full SEM model

Output 8 Full SEM model after modification

4 Extensions of SEM and Their Application in Research on Professional Learning

Confirmatory factor analysis, path analysis and structural equation modelling have been commonly applied by researchers in various fields, including the field of professional learning. ‘Basic’ SEM models already offer a variety of possibilities, because different types of relationships can be modelled based on the theoretical foundations of the study. In addition, various extensions or specific forms of SEM offer a wide range of possibilities. Within this section, these extensions will be presented, accompanied by examples of how these analyses can be applied within the field of professional learning.

Within the section on confirmatory factor analysis, we discussed measurement invariance across groups. If measurement invariance across groups is established, you could also investigate whether the SEM model is equal or different for different groups of participants. Establishing measurement invariance is important in this regard, as you would want to make sure that the differences you find in the model reflect differences in the true constructs and relationships and not mere differences in measurement. With multiple-group SEM, it is possible to establish whether models are different for different groups of participants. The study of Hurtz and Williams (2009), for example, applied a multiple-group path analysis. This study examined attitudinal and motivational antecedents of employees’ participation in development activities based on data collected within four organisations. A multiple-group path analysis was used to investigate whether the model differed for the four organisations that were included in the study. They concluded that although the strengths of the different paths differed across organisations, the patterns were the same as in the full SEM model that was based on the data from all four organisations (Hurtz & Williams, 2009). Within their study, they also examined whether their measurement was invariant across the groups and concluded that this was not the case. However, because the focus lay on identifying a general model of the relationship of the antecedents with participation, and not on explaining organisational differences, Hurtz and Williams (2009) applied group-mean centring to remove organisational mean differences: they computed the difference between the employee score and the organisational mean of the scores and took this difference as their primary outcome variable. After applying this group-mean centring, measurement invariance was reached. However, it must be noted that no conclusions can then be drawn regarding possible organisational differences. As an alternative to this approach, on the condition that metric invariance is achieved and the sample size is large enough, one could also adopt a full SEM model, sometimes also called a multiple-indicator SEM model, because this allows the modelling of intercept variances. Multiple-group SEM analysis can also be applied to compare the models of males and females, high- and low-qualified employees, etc. In contrast to multilevel SEM analysis (see below), multiple-group analysis does not require that the different groups are sampled at random. Multiple-group SEM can be conducted with the lavaan package (Rosseel, 2012).
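A minimal sketch of a multiple-group SEM in lavaan; organisation is an assumed grouping variable, and the group.equal argument constrains the factor loadings (metric invariance) across groups:

```r
# Multiple-group SEM with loadings constrained equal across groups.
fit.mg <- sem(sem.model, data = dat, group = "organisation",
              group.equal = "loadings")
```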

Multilevel SEM requires a random sampling of groups because it assumes that the differences between organisations in terms of the intercept and slope are normally distributed around the average intercept or slope that holds for the population. For a basic introduction to multilevel analysis within professional learning, the reader is referred to Kyndt and Onghena (2014). In short, multilevel SEM combines multilevel analysis and SEM analysis. SEM analysis is not able to take the nested structure of the data into account (i.e. employees nested within organisations), while multilevel analysis is not able to examine more complex models. Multilevel analysis is comparable to regression analysis with regard to the type of relationships it investigates. In addition, multilevel analysis does not provide goodness-of-fit indices such as the CFI, SRMR or RMSEA; one can only conclude that one model fits the data better in comparison with another model (Kyndt & Onghena, 2014). Within a multilevel SEM, the model is estimated while the organisational clustering is taken into account. Taking the organisational clustering into account is important because ‘if the nested structure of the data is ignored, it is more likely that statistical relations are observed in the sample that are in fact not true (type-1 error), in addition it might be that it is concluded that a relationships holds for individuals when they are actually true for groups (ecological fallacy)’ (Kyndt & Onghena, 2014, p. 339). Moreover, within multilevel SEM, predictors at the level of the organisation can be combined with predictors at the individual level. A final difference between multilevel SEM and multilevel analysis is that multilevel SEM can also predict outcomes at the organisational level, whereas in traditional multilevel analysis the outcomes or dependent variables need to be situated at the lowest level, that is, the individual level. Sometimes this latter issue is resolved by aggregating the individual scores to the organisational level (after the within-group agreement has been checked); however, by doing this, a lot of statistical power is lost, as well as potentially interesting individual differences within organisations. To our knowledge it is not possible to conduct multilevel SEM with the R software; it is possible with Mplus (Muthén & Muthén, 1998–2010). More information on multilevel SEM can be found in the article of Kaplan and Elliott (1997). Within educational sciences and labour psychology, different examples of empirical studies using multilevel SEM can be found (e.g. Johnsrud, Heck, & Rosser, 2000; Mauno, Kiuru, & Kinnunen, 2011; Sebastian & Allensworth, 2012). However, we were not able to detect a specific example within the field of professional learning.

The two extensions of SEM presented above have in common that they add to the complexity of the model; as a consequence, both techniques usually require very large samples (many organisations and many employees within those organisations). However, partial least squares (PLS) might be an interesting alternative to SEM if the sample size requirements cannot be met. Where SEM combines factor analysis and path models, PLS combines principal component analysis and path models (Garthwaite, 1994; Goutis, 1996; Hoyle, 1999; Iacobucci, 2010). Principal component analysis does not aim at reflecting latent constructs; rather, it tries to predict the component as well as possible. Factor analysis most commonly uses maximum likelihood estimation, which considers the variance the different observed variables have in common, while principal component analysis takes all variance into account when estimating the component (Garthwaite, 1994; Hoyle, 1999). Because the focus lies on maximising prediction and capturing as much variance as possible from the dependent variable, PLS is better suited to exploratory than confirmatory purposes (Hoyle, 1999; Iacobucci, 2010). The loadings tend to be overestimated and the path coefficients underestimated (Goutis, 1996; Iacobucci, 2010). In addition, no goodness-of-fit indices are provided; as with multilevel analysis, one can only judge which model is more suitable in comparison with another. Nevertheless, it is an interesting approach when you want to explore complex models with a limited number of observations. Gegenfurtner (2013) illustrates the use of PLS in his study on the relationship between motivation to transfer, retention, transfer and attitudes.

5 Discussion

Throughout this chapter we have tried to introduce the reader to SEM by means of an illustration within the field of professional learning. The models tested within this chapter serve solely to illustrate the method and should not be used for interpreting the relationships between the constructs; those results and interpretations can be found in the publication of the empirical study, in which the relationships were examined including control variables (Kyndt et al., 2013). SEM offers researchers many possibilities to investigate complex models. Within the paradigm of the integration of work and learning, advanced techniques such as multiple-group and multilevel SEM might be especially relevant, because these techniques allow the simultaneous examination of individual and organisational differences. These techniques are interesting if the goal is to investigate professional learning conceptualised as a reciprocal interaction between individual and organisation (Tynjälä, 2008).

However, despite all the possibilities SEM offers, it also has its limitations and possible pitfalls. The two most evident limitations are the necessity for a strong theoretical basis and the large samples that are needed especially when comprehensive models with many variables need to be estimated. Although simple models can be estimated with moderate to small samples, the added value of SEM foremost lies in estimating complex models (Hox & Bechger, 1998).

One of the most common pitfalls of SEM is that many researchers are tempted to interpret SEM models as causal models due to the impression given by the direction of the paths. However, merely applying SEM to your data does not provide proof of the causality of a relationship (Bollen & Pearl, 2013). A SEM model can, however, raise doubts about a causal theory: when the SEM model is correctly specified and the covariance structure does not support the theoretical causal structure, it seems less plausible that the causal relationships exist. One can say that establishing covariances or correlations between the variables is a necessary but not sufficient condition for establishing causality. To be able to prove the causality of relationships empirically, data that allow this type of conclusion need to be collected (e.g. longitudinal or experimental data). If SEM is applied to correlational data, the SEM model cannot be interpreted as a causal model (Iacobucci, 2009).

The aim of the current book chapter was to introduce the reader to the possibilities that SEM offers within the field of research on professional learning. Within the paradigm of the integration of work and learning, the possibility to analyse more complex models can contribute to theory development and a further understanding of how learning and working are intertwined.