Teachers’ beliefs about teaching and learning are an important aspect of teachers’ professional competence (Fives and Buehl 2012), as they influence their instructional behaviour and consequently their students’ achievement (Staub and Stern 2002). Although such beliefs are relatively stable, they are modifiable through interventions (e.g. Barlow and Cates 2006) that involve reflection on beliefs and instructional behaviour as well as processing information in a systematic way (Gregoire 2003; Philipp 2007). Several scholars have investigated the influence of university education on student teachers’ beliefs and the influence of professional development programmes on the beliefs of in-service teachers (e.g. Tsai 2006). These studies show that constructivist-oriented programmes are effective in engaging teachers and student teachers in examining and changing their beliefs (for a review see Fives and Buehl 2012; Richardson 1996). However, the impact that induction programmes have on the beliefs of beginning teachers is rarely investigated (Wang et al. 2008). In our study, we examine the relationship between discourse used during induction programmes and beginning teachers’ reflection and beliefs about teaching and learning mathematics. Reflection—a prerequisite to changing beliefs—can be induced for example through cognitive conflicts which occur when dealing with contrasting points of view (Posner et al. 1982). We investigate how the discourse during seminars in the induction phase should be designed in order to induce cognitive conflicts and reflection in teacher candidates and ultimately lead to change in their beliefs. Our study is embedded in the context of teaching and learning mathematics. Teachers’ beliefs in this domain have been addressed in several studies, which give us the opportunity to build upon this research (e.g. Handal 2003).

Beliefs about teaching and learning mathematics

Teachers’ beliefs can be described as “psychologically held understanding, premises, or propositions about the world that are felt to be true” (Richardson 1996, p. 103). Calderhead (1996) considered qualitative differences in teachers’ beliefs and distinguished five areas in which teachers could hold beliefs: learners and learning, teaching, subjects, learning to teach, and teachers’ self and the teaching role. These areas may be interconnected, meaning teachers can hold related beliefs about teaching and learning or teaching and a subject (Calderhead 1996). Empirically, beliefs about teaching and learning often coincide (e.g. Voss et al. 2013). Therefore, in our study, we focus on teacher candidates’ beliefs about teaching and learning together.

In the context of mathematics, teachers’ beliefs about teaching and learning are described by several authors (e.g. Handal 2003) using two perspectives: transmissive and constructivist. From the transmissive perspective, it is assumed that teachers impart mathematical knowledge and skills to their students who are rather passive receivers of information (Perry et al. 1999; Prawat 1992). On the other hand, the constructivist perspective mirrors constructivist learning theories. Here, students actively construct knowledge on the basis of prior knowledge and beliefs (Putnam and Borko 1997). This process takes place in an interaction between students and teachers and can be best induced by real-life or authentic situations (Loyens and Gijbels 2008). Constructivist theories are theories of learning (Bransford et al. 2000). However, it is possible to deduce suggestions for teaching (Loyens and Gijbels 2008). According to our understanding, teaching can be described as constructivist only if it takes into account a constructivist understanding of learning. Thus, a key component of constructivist teaching is to induce appropriate cognitive processing and to take students’ prior knowledge into account (Mayer 2009).

Constructivist and transmissive understandings of teaching and learning are often described as diametrically opposed (Staub and Stern 2002). Accordingly, in several studies, teachers’ beliefs are conceptualized in a clear dichotomy of transmissive vs. constructivist beliefs, and beliefs are assessed on one scale with the endpoints “constructivist” and “transmissive” (e.g. Staub and Stern 2002). However, it is possible that teachers hold both views (Voss et al. 2013). Therefore, we do not conceptualize beliefs as a clear dichotomy. Instead, we use two scales to assess both belief dimensions. This means that from our point of view, constructivist and transmissive beliefs are not mutually exclusive; instead, a variety of combinations are possible: teacher candidates could be high in one and low in the other but also high in both or low in both.

Teachers’ transmissive and constructivist beliefs influence their behaviour in the classroom and consequently their students’ achievement (Peterson et al. 1989; Staub and Stern 2002). Staub and Stern (2002) and Peterson et al. (1989) found that mathematics teachers with stronger constructivist beliefs made more frequent use of cognitively challenging tasks whose solutions required conceptual understanding. The results of these studies also showed an association between a more constructivist orientation and higher student achievement.

Overall, there are various beliefs about teaching and learning and how they function. There is a rather intuitive and simple understanding of learning as a transmitter-receiver model, which seems to be unfavourable in terms of the effects it produces on teachers’ instructional behaviour and consequently on their students’ learning. In addition, there is a more complex constructivist understanding which seems to be more favourable in terms of its effects. In teacher education, concepts of constructivist teaching and learning are widespread and there is a trend to foster a constructivist understanding of learning. Thus, many teacher educators attempt to challenge the transmissive understanding student teachers have when they start their studies (Lunenberg 2010).

Teacher learning and changes in teachers’ beliefs

To describe and explain how teachers learn and how they change their beliefs, researchers often draw on learning theories which are discussed in the context of student learning, namely constructivist learning theories, conceptual change theories and models of provision and uptake of learning opportunities (Kunter et al. 2013; Patrick and Pintrich 2001; Putnam and Borko 2000).

Patrick and Pintrich (2001) applied the conceptual change model of Posner et al. (1982) to teacher learning, noting that pre-service teachers, like school students, start their studies with previously held concepts. They assumed that pre-service teachers could change their beliefs when they learned the limits of their beliefs and became aware that their beliefs were unreasonable. As a result, pre-service teachers would become dissatisfied enough with their beliefs to reflect on and differentiate them and ultimately change them.

Equally, the model of the determinants and consequences of teachers’ professional competence (Kunter et al. 2013) emphasizes the active role of learners, meaning that teachers’ learning depends on their uptake of learning opportunities (Fend 1981; Helmke 2009). This uptake may be expressed in the choice of learning opportunities as well as in the intensity of cognitive processing of the chosen learning opportunity. The model assumes that teachers’ personal prerequisites such as cognitive skills, motivation and personality influence their uptake of learning opportunities which in turn determines the development of teachers’ beliefs (Kunter et al. 2013).

Learning in a constructivist environment which demands an active role of learners is challenging, especially for those with unfavourable preconditions. Various individual and contextual factors can influence success in learning. First, in terms of individual factors, learners’ prior knowledge plays an important role in learning, as it forms a basis for new knowledge (Hashweh 2003). Learners with unfavourable preconditions, such as little prior knowledge, can be over-challenged in a constructivist learning environment and may experience disorientation and mental overload (Kirschner et al. 2006).

Second, learners’ motivation is seen as a prerequisite to learning (Palmer 2005; Pintrich et al. 1993). Pintrich et al. (1993) extended the purely cognitive approach of conceptual change to motivational and affective factors. They assumed that learners’ goals, values, self-efficacy and control beliefs might influence the process of conceptual change and that the depth of processing would be related in part to these motivational factors. Palmer (2005) stated that the active construction of knowledge required learners’ effort and that, therefore, motivation was required, meaning students would not make an effort unless they were motivated to do so. Motivation can be divided into a situational motivational component and trait-like motivational orientations which are relatively stable but nevertheless modifiable (Eccles and Wigfield 2002). Below, we will focus on the latter.

In this context, learners’ self-efficacy is an important motivational construct. According to Bandura (1997), individuals with a high level of self-efficacy work harder and are more persistent in accomplishing a task. Also, teachers with a higher level of self-efficacy were more motivated to engage in informal learning activities (Lohman 2006).

Not only is learners’ positive appraisal of their ability essential for engagement in an activity, but so is their intrinsic motivation (Pintrich 2003). In connection with teachers, intrinsic motivational orientations are conceptualized in different ways (e.g. Carbonneau et al. 2008; Long and Hoy 2006). This study focuses on enthusiasm, defined as a relatively stable affective disposition which can be related to a topic or an activity (Kunter et al. 2011). An activity-related intrinsic orientation is important for adaptive and persistent learning (Pintrich 2003).

Third, learners’ emotional constitution can also be discussed as an influencing factor in their learning. In her model, Kwakman (2003) presumed that personal factors (for example emotional exhaustion), task factors and the work environment influence participation in professional learning activities. In connection with teacher candidates, emotional exhaustion is an important factor to consider, as beginning teachers often experience the initial years of teaching as stressful (Goddard et al. 2006). In addition, high emotional stress can limit cognitive capacity and learning ability (Feuerhahn et al. 2013).

One model which integrates these assumptions is Gregoire’s (2003) cognitive-affective model of conceptual change. In this model, the role of emotion, motivation and cognition in predicting teachers’ conceptual change processes is taken into consideration. The basic principle of the model is that a lasting change in beliefs can occur only when a message, for example a new understanding of teaching, is processed in a systematic way rather than in a heuristic way. A teacher who processes the message systematically will process it deliberately and with high cognitive investment. This processing is influenced by the appraisal of the situation in which the teacher is confronted with the message and by his or her resources, such as knowledge, motivation, time and social support. This model was tested in several qualitative and quantitative studies, and its main assumptions were confirmed (Ebert and Crippen 2010; Gill et al. 2004).

Teachers’ reflection on their instructional behaviour can be seen as one kind of systematic processing and is considered the critical factor for helping teachers change their beliefs and practices (Philipp 2007). Through reflection, teachers may see things differently than before and may pay attention to previously unnoticed issues. Through seeing things differently, teachers challenge their existing beliefs, which leads to associated changes in beliefs (Philipp 2007).

Fostering reflection and changes in beliefs during teacher education

According to constructivist learning theories, learning is not an individual process; it is always embedded in a social context. Putnam and Borko (1997) stated that what we take as knowledge and how we think are the products of the interactions of groups of people over time. In this context, groups of learners who share ways of thinking and communicating are called discourse communities (Putnam and Borko 2000). Teacher discourse communities may provide teachers with an opportunity to adopt a critical and reflective perspective on their teaching (Putnam and Borko 1997, 2000).

Teacher candidates in induction programmes can be seen as one example of a discourse community. Such programmes offer learning opportunities for beginning teachers transitioning from university studies to practice. One goal of teacher induction programmes is to develop reflective practitioners (Schön 1987; Zeichner and Liston 1996). Because teachers do not develop the ability to reflect spontaneously, student teachers must be introduced to reflection during their education (Hatton and Smith 1995). Wood and Stanulis (2009) describe nine aspects that effective induction programmes should include. Among them are mentoring novice teachers and encouraging reflective inquiry and teaching practices. However, the extent to which reflection is promoted by mentors during teacher induction programmes varies among programmes (Young 2007).

Overall, teacher candidates’ reflection can be influenced by various aspects of induction programmes, for example by their mentors at school or by specific induction courses (Young 2007). In terms of contextual factors, this study focuses on the effects of various kinds of discourse during induction classes. In our study, we consider two dimensions of discourse quality on the basis of the aforementioned theories. The first dimension is called “discussing different points of view” and describes the extent to which teacher educators stimulate teacher candidates to think systematically, give impetus to deeper examination and introduce cognitive conflicts in teacher candidates, for example through discussing contrasting points of view. The second dimension “sharing experiences only” describes the degree to which teacher candidates exchange their experiences and confirm each other’s views rather than discussing controversial issues. Instead of cognitive conflicts and intense discussion, reinforcement of prior beliefs takes place.

The present study

In our study, we first investigate how the quality of discourse during teacher induction programmes is related to teacher candidates’ reflection practices and their individual changes in beliefs about teaching and learning mathematics. Second, we investigate how the quality of discourse as a contextual variable and teacher candidates’ individual motivational-affective resources interact in the development of teacher candidates’ reflection.

Our study was conducted in Germany, where an induction programme is mandatory for beginning teachers and teacher education consists of two consecutive phases (Cortina and Thames 2013). In the first phase, student teachers attend university courses in two subjects, as well as in didactics and educational psychology (Terhart 2003). The programme takes 3.5 to 4.5 years and is theory- and knowledge-oriented. The second phase is a compulsory practice-oriented induction phase that takes 1.5 to 2 years (Terhart 2003). During the induction phase, teacher candidates are placed at schools where they are gradually introduced to teaching under the supervision of a mentor. In addition, teacher candidates attend weekly courses in general principles and methods of teaching and subject-specific methods at state-run teacher education institutes at the school district level (Cortina and Thames 2013). In Germany, these teacher education institutes are called “seminars” and courses are conducted by teacher educators who also work as regular school teachers. Here, teacher candidates can deepen their pedagogical content knowledge, acquire practical teaching skills and exchange views and experiences (Richter et al. 2011). An important aspect of the second phase is teacher candidates’ theory-based reflection on their own instruction. For that, teacher candidates contribute their own learning experience, which is then reviewed during the course (Lenhard 2004).

Hypotheses

As mentioned above, reflection is a prerequisite to changing beliefs (Gregoire 2003; Philipp 2007). Through reflection on their own teaching, teacher candidates may realize that learning is a complex process involving more than merely transmitting knowledge and thus incorporate a constructivist understanding of learning. Constructivist teaching in mathematics is state-of-the-art and recommended by national committees (National Council of Teachers of Mathematics 2013). In addition, teacher educators prefer constructivist beliefs, whereas they are neutral towards transmissive beliefs (Felbrich et al. 2008).

  Hypothesis 1a

    Teacher candidates’ reflection will increase their constructivist beliefs and decrease their transmissive beliefs.

Concerning the different qualities of discourse, we assume that discourse in the learning communities in which different points of view are juxtaposed to elicit discussion will predict teacher candidates’ reflection on their teaching. In contrast, discourse in which experiences are shared but different views are not juxtaposed will not be related to teacher candidates’ reflection. Both kinds of discourse can occur during induction classes. We therefore did not categorize the induction classes as belonging to the first or the second dimension but assessed the degree of each kind of discourse in all induction classes surveyed. This means that a seminar could be rated high or low on both dimensions, or low on one dimension and high on the other.

  Hypothesis 1b

    Discussing different points of view in the learning community positively predicts teacher candidates’ reflection, whereas sharing experiences only has no effect.

In a subsequent step, hypotheses 1a and 1b are combined in a mediation model.

  Hypothesis 1c

    Teacher candidates’ reflection mediates the relationship between the extent of discussing different points of view and teacher candidates’ belief change. No mediation effect is likely to be found for sharing experiences only.

Our second research question refers to the relevance of teacher candidates’ individual motivational-affective resources to the relationship between discourse quality and reflection. Embedded in the model of Kunter et al. (2013), discourse can be interpreted as a learning opportunity and reflection as an uptake of this learning opportunity. According to the model, this uptake (that is, reflection) is influenced by teacher candidates’ individual prerequisites, such as motivation. This means that the higher teacher candidates’ motivational-affective resources, the more they can use the learning opportunity. We follow the argumentation by Kirschner et al. (2006), who assume that the high effort of engaging in a constructivist learning activity (as provided by the discussion of different points of view) may be more easily accomplished by learners with more advantageous prerequisites.

In this study, we focus on self-efficacy, enthusiasm and emotional exhaustion, assuming that the potential relationship between discussing different points of view and teacher candidates’ reflection is moderated by these resources.

  Hypothesis 2a

    The relationship between discussing different points of view and reflection is stronger for teacher candidates with a high level of self-efficacy than for those with a low level of self-efficacy.

  Hypothesis 2b

    The relationship between discussing different points of view and reflection is stronger for teacher candidates with a high level of enthusiasm than for those with a low level of enthusiasm.

  Hypothesis 2c

    The relationship between discussing different points of view and reflection is stronger for teacher candidates with a low level of emotional exhaustion than for those with a high level of emotional exhaustion.

Method

Design

This study was part of a larger project which focused on the development of German secondary school mathematics teacher candidates’ professional competence during their induction phase. It is a study with repeated measurements and two cohorts. The first cohort consisted of teacher candidates at the beginning of their 2-year induction phase who had no teaching experience. The second consisted of teacher candidates at the beginning of their second year of induction who had already started to teach independently. Both cohorts were assessed at the beginning of the school year and again at the end of the school year. The teacher candidates participated voluntarily and received remuneration. Teacher candidates were clustered within seminars which made it possible to investigate the effects of the seminar discourse quality in a multilevel design.

Sample

Our sample consisted of 536 secondary school mathematics teacher candidates from both cohorts who belonged to 100 different seminars and participated at both measurement points. On average, five teacher candidates were assessed per seminar (min = 1, max = 81, median = 3). The mean age of the teacher candidates (66 % female) at the time of the first data collection was M = 27.5 years (SD = 3.8, min = 23, max = 54, median = 27). Teacher candidates in the two cohorts did not differ significantly in demographic and educational background (gender, final examination school grade, desired teaching qualification: academic track versus non-academic track, highest parental score on the International Socio-Economic Index of Occupational Status) except, as expected, in age.

Measures

Teacher candidates’ beliefs were recorded on a paper-and-pencil questionnaire at the first and the second measurement point (MP1 and MP2). Their reflection, their evaluation of the discourse quality and their resources were also recorded on a questionnaire at MP2. Table 1 provides an overview of the descriptive data of all study variables.

Table 1 Overview of measures and descriptive statistics

Discourse quality

Two scales were developed to measure different qualities of discourse employed during the induction courses. The scale named “discussing different points of view”, which consisted of five items (e.g. “In our seminar we are encouraged to also discuss conflicting views among ourselves.”), tapped the degree to which active processing of different points of view was encouraged. The other scale, “sharing experiences only”, which consisted of five items (e.g. “In our seminar, we regularly speak about what we have experienced recently during our own teaching at school.”), described situations in which although discussion was generally encouraged, this did not explicitly include a juxtaposing of different views. Responses on both scales were given on a 6-point Likert scale, ranging from 1 (disagree) to 6 (agree).

The quality of discourse employed during seminars is a seminar-level construct which was assessed at the individual level. To investigate whether it is feasible to aggregate individual answers to a seminar mean, we computed the between-seminar variance (ICC1), the reliability indices of the seminar means (ICC2) and the agreement within seminars (ADM).

The intra-class correlations (ICC1) of 0.25 (discussing different points of view) and 0.34 (sharing experiences only) indicate that ratings of the teacher candidates systematically varied among seminars. Compared to other multilevel studies in instructional research, these ICC1 values are large (Marsh et al. 2012). These large values are not only an important prerequisite for the following multilevel analyses but are also a relevant result in their own right. They show that 25 and 34 %, respectively, of the variance in teacher candidates’ perception of the discourse can be attributed to the seminar.

The ICC2 for “sharing experiences only” (0.73) was higher than the critical limit of 0.70, which indicates good reliability (Lüdtke et al. 2006). The ICC2 for “discussing different points of view” (0.64) was only slightly below the limit, so it still represents an acceptable reliability.
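The two indices can be illustrated with a minimal sketch. It assumes the common one-way random-effects ANOVA estimators and the Spearman-Brown relation between ICC1 and ICC2; the text does not state which estimators were actually used, and the function names are hypothetical.

```python
from statistics import mean


def icc1_icc2(groups):
    """Return (ICC1, ICC2) from a one-way random-effects ANOVA.

    groups: list of lists, one inner list of ratings per seminar.
    Assumed estimators; not necessarily those used in the study.
    """
    n_groups = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between- and within-seminar sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)

    # Effective group size for unequal seminar sizes.
    k = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (n_groups - 1)

    icc1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc2 = (ms_between - ms_within) / ms_between
    return icc1, icc2


def spearman_brown(icc1, k):
    """Reliability of a seminar mean based on k raters (ICC2 from ICC1)."""
    return k * icc1 / (1 + (k - 1) * icc1)


# Average seminar size in the sample: 536 candidates / 100 seminars.
k_avg = 536 / 100
icc2_discussing = spearman_brown(0.25, k_avg)
icc2_sharing = spearman_brown(0.34, k_avg)
```

Under these assumptions, the reported ICC1 values and the average seminar size of 5.36 reproduce the reported ICC2 values of 0.64 and 0.73 after rounding.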

We calculated the average deviation index ADM per seminar which is an indicator for within-group agreement (Burke et al. 1999). The mean ADM values of “discussing different points of view” (0.76) and “sharing experiences only” (0.78) are both smaller than the threshold of 1 for a 6-point scale proposed by Burke et al. (1999). This indicates significant rater agreement. Thus, it is feasible to consider quality of discourse as a reliable level 2 construct. Through using a level 2 construct, the seminar conditions were assessed in the way they were seen by all participants, so that the results would not be distorted by individual perception tendencies (Marsh et al. 2012).
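As an illustration, the average deviation index can be sketched as follows. The sketch assumes the usual definition (mean absolute deviation of ratings from the item mean, averaged over items) and the c/6 rule of thumb for the upper limit, which gives the threshold of 1 for a 6-point scale; function names and example ratings are hypothetical.

```python
from statistics import mean


def ad_m_item(ratings):
    """Average absolute deviation of one item's ratings from their mean."""
    m = mean(ratings)
    return sum(abs(x - m) for x in ratings) / len(ratings)


def ad_m_scale(item_ratings):
    """Scale-level AD_M for one seminar: mean of the item-level values.

    item_ratings: list of lists, one inner list of ratings per item.
    """
    return mean(ad_m_item(r) for r in item_ratings)


def agreement_threshold(n_options):
    """Assumed rule of thumb: upper limit c/6 for c response options."""
    return n_options / 6


# Hypothetical seminar with three raters and two items on a 6-point scale.
seminar = [[4, 5, 6], [3, 3, 4]]
agrees = ad_m_scale(seminar) < agreement_threshold(6)
```

Lower values indicate closer agreement, so a seminar counts as sufficiently homogeneous when its value falls below the threshold.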

Reflection

To measure teacher candidates’ degree of reflection and their attempts to try new teaching methods based on their experience, the “reflection” scale, developed within the project, was used. The scale consisted of four items (e.g. “In our seminar, we regularly speak about what we have experienced recently during our own teaching at school.”, α = 0.69). Teacher candidates responded to these statements on a 6-point Likert scale, ranging from 1 (disagree) to 6 (agree). The ICC1 figure of 0.05 indicates that the seminars were comparable in terms of their composition.

Beliefs

To assess teacher candidates’ beliefs, we used the two scales “constructivist beliefs about teaching and learning mathematics” and “transmissive beliefs about teaching and learning mathematics” of a teacher belief inventory validated in previous studies (e.g. Voss et al. 2013). Each scale consisted of 10 items (transmissive: e.g. “Students learn mathematics best by watching their teacher do example problems.”, α MP1 = 0.78, α MP2 = 0.83; constructivist: e.g. “In mathematics, teaching goals are achieved best if students find their own methods for solving problems.”, α MP1 = 0.79, α MP2 = 0.82). The answers on both belief subscales and on the three scales in the next section were given on a 4-point Likert scale, ranging from 1 (disagree) to 4 (agree).
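The internal consistencies reported as α can be illustrated with a minimal sketch of Cronbach's alpha, assuming the standard formula based on item variances and the variance of the sum score. The function name and the data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of lists, one inner list of respondent scores per item
    (all inner lists in the same respondent order).
    """
    k = len(items)
    # Sum score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Hypothetical responses of four persons to two items.
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
```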

Resources

Self-efficacy beliefs

Teacher candidates’ self-efficacy beliefs in everyday school life were measured using a well-established German scale developed by Schwarzer and Schmitz (1999) which had been validated in previous studies (Schmitz 2001). It consisted of 10 items (e.g., “I am convinced that I am able to successfully teach all relevant subject content to even the most difficult students.”, α = 0.77).

Enthusiasm for teaching

To measure teacher candidates’ enthusiasm for teaching, a scale by Kunter et al. (2011) consisting of six items was used (e.g., “I really enjoy teaching.”, α = 0.84).

Emotional exhaustion

Teacher candidates’ emotional exhaustion was assessed using a four-item scale (e.g., “I often feel exhausted at work.”, α = 0.82) based on the German adaptation (Enzmann and Kleiber 1989) of the Maslach Burnout Inventory (Maslach et al. 1996). This scale has been employed in previous studies in education (e.g. Klusmann et al. 2008). The ICCs of teacher candidates’ resources (self-efficacy beliefs: ICC1 = 0.01, enthusiasm: ICC1 = 0.02, emotional exhaustion: ICC1 = 0.05) indicate that seminars were comparable in terms of their composition.

Statistical analysis

As the teacher candidates’ ratings were collected from various seminars, the data have a hierarchical structure, with teacher candidates nested in seminars. Therefore, we used a two-level analysis approach in which the data of teacher candidates were evaluated simultaneously on the individual level and on the seminar level.

Regression analyses were calculated for the first block of hypotheses. First, the effect of teacher candidates’ reflection on their individual changes in beliefs—both modelled on level 1—was computed. To control for prior differences in beliefs, beliefs assessed at MP1 were included in the model. Next, the effect of both discourse quality scales—modelled on level 2—on teacher candidates’ reflection—modelled on level 1—was investigated.

Finally, the prior analyses were combined in a multilevel mediation model (Krull and MacKinnon 2001). Following recommendations by Preacher (2011), we used a 2-1-1 model with discourse quality as an independent variable on level 2, reflection as a mediator on level 1, and beliefs as dependent variables on level 1.

For the second block of hypotheses, multilevel regression analyses with cross-level interactions were computed. The seminar variable “discussing different points of view” was modelled on level 2. Teacher candidates’ resources as a moderator and their reflection were modelled on level 1. Resources were centered at the group mean, while “discussing different points of view” was centered at the grand mean (Enders and Tofighi 2007).
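The two centering strategies can be sketched as follows. This is only an illustration with hypothetical function names and data; the actual analyses were run in Mplus.

```python
from collections import defaultdict
from statistics import mean


def center_grand_mean(values):
    """Subtract the overall mean (used here for the level 2 predictor)."""
    m = mean(values)
    return [v - m for v in values]


def center_group_mean(values, groups):
    """Subtract each seminar's own mean (used here for level 1 resources)."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - group_means[g] for v, g in zip(values, groups)]


# Hypothetical enthusiasm scores of four candidates in two seminars.
scores = [1, 3, 10, 20]
seminars = ["a", "a", "b", "b"]
centered = center_group_mean(scores, seminars)
```

After group-mean centering, a resource score expresses how a candidate compares with the other candidates in the same seminar, so the level 1 moderator is not confounded with differences between seminars.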

To calculate practical effect sizes, the following formula was employed (Reyes et al. 2012): \( \delta = \frac{\gamma}{\sqrt{\tau_{00} + \sigma^2}} \). Here, \( \gamma \) is the association between predictor and outcome variable, and \( \tau_{00} \) and \( \sigma^2 \) are the between- and within-group variances of the outcome variable from the unconditional model, respectively. According to Reyes et al. (2012), \( \delta \) can be interpreted similarly to Cohen’s (1988) d: 0.2 is small, 0.5 is moderate and 0.8 is large.
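A short numerical sketch shows how the formula behaves. The variance components below are assumptions chosen for illustration: for a z-standardized outcome the total variance is taken to be 1, split according to an ICC1 of 0.05.

```python
from math import sqrt


def effect_size_delta(gamma, tau00, sigma2):
    """delta = gamma / sqrt(tau00 + sigma2).

    gamma:  coefficient linking predictor and outcome
    tau00:  between-group variance of the outcome (unconditional model)
    sigma2: within-group variance of the outcome (unconditional model)
    """
    return gamma / sqrt(tau00 + sigma2)


# Assumed values: total variance 1, of which 5 % lies between seminars.
delta = effect_size_delta(gamma=0.62, tau00=0.05, sigma2=0.95)
```

When the two variance components sum to 1, as assumed here for a z-standardized outcome, the effect size equals the coefficient itself.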

For all analyses, the statistical software package Mplus (Muthén and Muthén 1998–2010) was used. Because a small amount of data was missing (<1.5 %, see Table 1), the full information maximum likelihood (FIML) procedure (Enders 2001) was employed. Before computation, all variables were z-standardized. All significance testing was undertaken at the 5 % level.

Results

Tables 1 and 2 show the descriptive statistics and the intercorrelations on the individual level and the seminar level among all scales used in this study.

Table 2 Intercorrelations between scales on the individual level and seminar level

Teacher candidates’ reflection and changes in their beliefs (hypothesis 1a)

Multiple regression analyses on the individual level confirmed the first hypothesis (see Table 3). The more teacher candidates reflected on their instruction, the more they endorsed constructivist beliefs and the less they endorsed transmissive beliefs. The corresponding effect sizes can be interpreted as small.

Table 3 Multiple regression analyses of teacher candidates’ reflection on their beliefs at time 2 controlled for their beliefs at time 1 (level 1 only)

Relationship between discourse quality and teacher candidates’ reflection (hypothesis 1b)

A multilevel regression analysis revealed that in seminars in which different points of view were discussed, teacher candidates reported a higher level of reflection (β = 0.62, SE = 0.15, δ = 0.62, p < 0.05), indicating a moderately high effect. As assumed, sharing experiences only did not significantly affect teacher candidates’ reflection (β = −0.25, SE = 0.13, δ = −0.25, p > 0.05). In total, 80 % of the variance of reflection on the seminar level was explained by both discourse quality scales.

Relationship between discourse quality and teacher candidates’ belief change mediated by their reflection (hypothesis 1c)

The multilevel mediation analysis (see Table 4) showed that the relationship between discussing different points of view and teacher candidates’ change in constructivist and transmissive beliefs was mediated by their reflection (significant indirect effects: constructivist beliefs 0.10, transmissive beliefs −0.05). As expected, sharing experiences only did not predict teacher candidates’ reflection. In addition, no significant indirect effects were found.

Table 4 Multilevel mediation analysis: influence of discourse quality on teacher candidates’ belief change mediated by teacher candidates’ reflection

The mediation model was compared to a black box model in which the relationship between discourse quality and teacher candidates’ change in beliefs was modelled without the mediator reflection. To compare both models, a χ² difference test (Bollen 1989) was computed. We examined whether the additional restriction in the black box model (mediation effect was set to zero) in contrast to the mediation model, in which the mediation effect was estimated, would lead to a significant decline of the model fit. A significant difference in the model fit was found (χ² = 11.38, Δdf = 2, p < 0.05). Thus, the model fit of the black box model was significantly worse. Overall, hypothesis 1c could be confirmed.
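The p value for the reported difference can be checked by hand. For two degrees of freedom, the upper-tail probability of the χ² distribution has the closed form exp(−x/2), so no statistics library is needed. The sketch below assumes an unscaled χ² difference. If a robust estimator was used, a scaled difference test would be required instead, which the text does not specify.

```python
import math


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    if x < 0:
        raise ValueError("chi-square values cannot be negative")
    return math.exp(-x / 2.0)  # closed form, valid for df = 2 only


delta_chi2 = 11.38  # difference reported in the text, with delta df = 2
p_value = chi2_sf_df2(delta_chi2)
critical_value = -2.0 * math.log(0.05)  # 5 % critical value for df = 2

print(f"p = {p_value:.4f}, critical value = {critical_value:.2f}")
```

The resulting p value of roughly 0.003 lies well below the 5 % level, in line with the reported decline in fit for the black box model.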

Moderator effect of teacher candidates’ motivational-affective resources (hypotheses 2a–c)

The second block of hypotheses assumed that teacher candidates’ motivational-affective resources, namely self-efficacy, enthusiasm and emotional exhaustion, would moderate the association between discussing different points of view during seminars and teacher candidates’ reflection. We examined only the relevance of teacher candidates’ resources concerning discussing different points of view as we assumed that this discourse would be related to teacher candidates’ reflection, whereas sharing experiences only would not be related. This presupposition was confirmed by the results above.

To investigate the moderator effects, one multilevel regression analysis with a cross-level interaction was computed for each resource (see Table 5). We found significant interaction effects for two of the three resources, namely enthusiasm and emotional exhaustion. The comparison of the graphs’ endpoints for high and low levels of enthusiasm in Fig. 1a shows that the difference in reflection among the seminars is larger for seminars showing low levels of discussing different points of view. Similarly, the graphs of Fig. 1b show a larger impact of the seminar on reflection for teacher candidates with a high level of emotional exhaustion, in other words few resources. The effect sizes for the main effects can be interpreted as moderate and those for the interactions as small. For self-efficacy, no cross-level interaction was found; however, the tendency in an interaction plot points in the same direction as the effects for enthusiasm.

Table 5 Testing a moderator effect of teacher candidates’ resources: multilevel regression analyses of resources and discourse quality on teacher candidates’ reflection
Fig. 1 Cross-level interaction between seminar discourse and enthusiasm (a) and seminar discourse and emotional exhaustion (b)

Altogether, a moderator effect of enthusiasm and emotional exhaustion could be found. However, the moderator effects were contrary to our assumptions, indicating that especially teacher candidates with fewer (rather than more) resources benefited from discussing different points of view.
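The direction of such a cross-level interaction can be made concrete with simple slopes, that is, the predicted effect of discourse on reflection at low (−1 SD) and high (+1 SD) levels of a resource. The Python sketch below uses hypothetical standardized coefficients, not the estimates from Table 5. A negative interaction term for enthusiasm reproduces the pattern described above, in which the slope is steeper for candidates with little enthusiasm. For emotional exhaustion, where high values mean few resources, the interaction term would carry the opposite sign.

```python
def simple_slope(b_discourse, b_interaction, resource_level):
    """Effect of seminar discourse on reflection at a given resource level (in SD units)."""
    return b_discourse + b_interaction * resource_level


# Hypothetical standardized coefficients (NOT the study's estimates).
B_DISCOURSE = 0.50     # main effect of discussing different points of view
B_INTERACTION = -0.10  # cross-level interaction with enthusiasm

slope_low = simple_slope(B_DISCOURSE, B_INTERACTION, -1.0)   # low enthusiasm
slope_high = simple_slope(B_DISCOURSE, B_INTERACTION, 1.0)   # high enthusiasm

print(f"low enthusiasm: {slope_low:.2f}, high enthusiasm: {slope_high:.2f}")
```

In this illustration, both slopes are positive, but the slope for candidates with low enthusiasm is larger, which mirrors the compensatory pattern rather than the originally hypothesized one.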

Discussion

As a first result, we found a positive association between reflection and belief change that is congruent with theoretical assumptions and previous studies, which show that reflection or systematic processing is a prerequisite to changing beliefs (Gill et al. 2004; Gregoire 2003; Philipp 2007). Second, teacher candidates’ reflection was predicted by the discourse used during the seminar, albeit depending on the kind of discussion held. In seminars where different points of view were discussed, teacher candidates showed a higher level of reflection, which in turn predicted their belief change. The direct effect of discussing different points of view on teacher candidates’ beliefs was not significant. Even so, a mediated effect can exist whether or not there is a significant effect of the independent variable on the dependent variable (MacKinnon 2008). In contrast, it seems that sharing experiences in a seminar only on a superficial basis does not induce teacher candidates to reflect on their instruction. If teacher candidates only share their experiences, their existing beliefs may act as a filter and influence the perception of their peers’ experiences. As a result, teacher candidates may consider only experiences which confirm their own beliefs, a phenomenon that is known as “confirmation bias” (Nickerson 1998).

Third, referring to the relevance of teacher candidates’ motivational-affective resources, the data confirmed a moderator effect for enthusiasm and emotional exhaustion, though in the opposite direction to our assumptions. This means that teacher candidates with few resources benefited more from a conflict-inducing seminar culture than did their peers. It seems the lack of individual resources could be compensated for by the guidance of the teacher educator. So far, it has been assumed that teacher candidates have to show a high level of motivation for the uptake of learning opportunities (Kunter et al. 2013; Pintrich et al. 1993). According to the observed effects, this does not necessarily seem to be the case. This can be interpreted as an empirical argument against the assumption that constructivist learning environments overstrain learners with unfavourable preconditions (Kirschner et al. 2006). The association between discussing different points of view and reflection was also positive for teacher candidates with many resources, but not to such a high extent as for teacher candidates with few resources.

For self-efficacy, no moderator effect could be found. This does not support our assumption that high self-efficacy is a positive resource that may enhance candidates’ readiness to engage in learning activities (Lohman 2006). However, the association between self-efficacy and learning engagement may be non-linear. For instance, Wheatley (2002) has argued that very high efficacy beliefs may even hinder teachers’ willingness to engage in learning or change. Teacher candidates with high levels of self-efficacy may feel confident in their knowledge and skills and thus may not consider suggestions from the discourse or see any need for reflection or change.

Strengths and limitations of the study

One strength of our study is that it makes an empirically based contribution to the field of teacher learning, in which research is still rather limited (Brouwer 2010). The study adds to the research on beginning teachers’ professional development by examining the cognitive processes underlying changes in beliefs, which so far have rarely been studied (Desimone 2009).

Another strength is that we considered the hierarchical structure of the data, with teacher candidates nested in 100 different seminars. However, the small number of on average five teacher candidates per seminar needs to be discussed. Hox (2010) concluded on the basis of simulation studies that a large number of groups rather than a large number of individuals per group seemed to be crucial for accuracy and high power in multilevel analyses. Moreover, simulation studies showed that relatively exact parameter estimations on the individual level with n = 5 were possible and that the estimation could be improved only through a disproportionate increase in sample size (Maas and Hox 2005).

One limitation of our study is that the mediation analysis was partly based on correlational data. However, we included teacher candidates’ beliefs at a prior measuring point to control for a priori differences. To obtain more robust results, a complete longitudinal mediation analysis assessing all three constructs at a minimum of three measuring points is warranted (Selig and Preacher 2009).

A second limitation of our study is that discourse quality was not assessed directly, but rather from the perspective of teacher candidates. However, through aggregation of the self-reports per seminar, the perception of discourse quality became a true level 2 construct which was not distorted by individual perception tendencies (Marsh et al. 2012). Nevertheless, more information on the discourse during the seminar would be desirable. Further studies should investigate discourse and its effects on reflection in more detail. Instead of self-reports which could contain memory effects, video recordings could be an alternative measure to reduce the bias.

A third limitation is the relatively small size of the effects. The data showed that seminars differed in their discourse quality. However, the mediation analysis showed small indirect effects for discussing different points of view. The effect sizes of the interaction between teacher candidates’ resources and discourse quality were small, whereas the main effects of discourse quality and resources on reflection can be interpreted as moderate. Thus, the interaction effects can be interpreted only as indications of the effect of individual resources.

A fourth limitation is that we cannot rule out a selection bias as the study only included voluntary participants. In addition, this study only dealt with the design of seminars during the induction phase as one part of teacher education. Certainly, there are other factors on different levels that may influence teacher candidates’ beliefs. On the one hand, other personal characteristics of teacher candidates, such as knowledge, and the immediate context of the school and the classroom may have an effect, e.g. teacher candidates’ experiences in the classroom or their mentors (Fives and Buehl 2012; Richardson 1996). On the other hand, the larger context of national policies and cultural norms and values may also influence teacher candidates’ beliefs (Woolfolk Hoy et al. 2006).

Future research

In order to get detailed empirical evidence of the interplay of learning opportunities for teacher candidates, their individual resources and their beliefs, the following aspects seem important for future studies. First, in this study, we focused only on teacher candidates’ motivational-affective resources. In future research, we will also consider other resources, such as knowledge, time and social support (Gregoire 2003). In particular, teachers’ professional knowledge is an important aspect to consider as it shapes their instructional practices and influences student learning (Depaepe et al. 2013).

Second, this study only considered the design of seminars during the induction phase as one aspect of teacher education. Future studies should widen the scope and investigate the interplay of different learning opportunities, e.g. seminars and mentoring.

Third, in this study, we considered constructivist and transmissive beliefs separately. However, further research could investigate the interplay of constructivist and transmissive beliefs. Person-centered analyses such as cluster analyses or latent profile analyses (Vermunt and Magidson 2002) may be suitable to analyze whether a specific pattern of teacher candidates’ constructivist and transmissive beliefs is beneficial for their instructional behaviour and student learning.

Theoretical and practical implications

When designing seminars for teacher candidates, several theoretical and practical impulses may be gathered from this study. Initially, the results of this study provide an indication that supports constructivist learning principles in teacher education. In addition, the results underline the important role of teacher educators in encouraging and moderating discussions in terms of scaffolding (Hardy et al. 2006). Furthermore, when designing seminars for teacher candidates, we should—according to the model of Kunter et al. (2013)—understand teacher candidates, similar to students, as learners with individual prerequisites and constraints. Only if we consider these prerequisites and constraints can we provide effective learning opportunities for them.

Some strategies to induce cognitive conflicts in teacher candidates and to facilitate the development of reflection may be of practical use, for example the method “critical friends” (Hatton and Smith 1995) and the use of video clubs (Sherin and Han 2004).

Video clubs refer to meetings in which groups of teacher candidates watch and discuss excerpts of video recordings of their classes. The advantages of videos are that they can capture the complexity and authenticity of what is happening in the classroom, they can be viewed several times and they require no immediate response to situations. This allows teacher candidates to reflect on their instruction and to analyze their decision-making processes (Sherin and Han 2004).

Altogether, Moir and Gless (2001) mentioned that high-quality induction programmes could act as a catalyst for changing school cultures and improving the teaching profession. Based on psychological learning theories, results from our study give a preliminary indication as to how these programmes may have to be shaped in order to achieve this desired effect.