1 Introduction

Teachers differ substantially in their capability to foster student learning and progress (Nye et al. 2004). Consequently, extensive research has examined what features characterize competent teachers. These features comprise professional knowledge, beliefs, motivational orientations, and self-regulation. The facets of professional knowledge, namely content knowledge (CK), pedagogical content knowledge (PCK), and pedagogical knowledge (PK), are all considered important cognitive components of teacher competence (Baumert and Kunter 2013). Inspired by the work of Lee Shulman (1987), PCK, that is, the knowledge needed to make concrete subject matter accessible to students, has become a promising construct that has been widely investigated (Depaepe et al. 2013). PCK is therefore “per definition” considered a core component of teacher competence, which has been substantiated in recent research on its impact on quality of instruction and student progress (Baumert et al. 2010; Hill et al. 2005; Sadler et al. 2013).

However, research has just started to investigate how and under which conditions teachers develop PCK (Friedrichsen et al. 2009). In the research literature, three assumptions prevail, concerning the role of prior PK and CK for the development of PCK: (1) CK and PK amalgamate, (2) CK is a necessary condition and facilitates PCK development, and (3) CK is a sufficient condition for teachers’ PCK development. From these as yet unsatisfactorily tested assumptions, strong implications for teacher education arise. In this chapter, we first elaborate on these theoretical assumptions and then present detailed information about the experimental study we conducted to test these hypotheses. This study was situated in the domain of mathematics (fractions: concept and computations). As our project was funded in the third phase of the priority program (Leutner et al. 2017, in this volume), only preliminary results can be reported. However, we present detailed information on the construction of courses and the tests of PCK, CK, and PK used in this study.

1.1 The Construct of Pedagogical Content Knowledge

Besides CK and PK, PCK is considered to be a unique domain of teacher knowledge. Although conceptualizations of PCK differ, two components are included in most PCK conceptualizations: (1) knowledge of student understanding and learning, and (2) knowledge of teaching in a concrete content domain (Depaepe et al. 2013). The fact that these categories refer to concrete subject matter distinguishes PCK from general PK about learners, learning and teaching.

A major issue in research on PCK is the proper assessment of this knowledge. Much research has relied on distal measures for teachers’ PCK: for instance, coursework, certifications, or participation in professional development programs. However, in most studies, these measures were poor predictors of classroom practice or student learning (e.g., Kennedy et al. 2008). It was only recently that research made progress in the more direct assessment of PCK (Krauss et al. 2008; Hill et al. 2005; Sadler et al. 2013). These measures have allowed for further testing the assumption of PCK as a unique dimension of teacher knowledge. Several studies provided factor analytical evidence that PCK may indeed be considered a separate dimension in teachers’ knowledge base for teaching (Blömeke et al. 2014; Hill et al. 2004; Krauss et al. 2008). Further, recent studies show that compared to CK, for instance, PCK possesses differential and unique properties concerning the prediction of classroom practice and student learning (Baumert et al. 2010; Sadler et al. 2013). From all these results, at least two PCK conceptualizations are called into question. First, some authors judged the concept of PCK to be redundant, contained within subject-matter knowledge (McEwan and Bull 1991). Second, in the integrative model of PCK (Gess-Newsome 1999), CK, PK and context knowledge constitute unique dimensions of teacher knowledge, and PCK must be formed from these resources in the concrete and situated act of teaching. In this conception, PCK is considered an elusive cognition. Gess-Newsome (1999) contrasts this model with the transformative model, in which PCK is conceptualized as a unique knowledge category. In the present study, we follow the idea of PCK as a unique dimension in teachers’ professional knowledge.

1.2 Conditions for the Development of Pedagogical Content Knowledge: The Role of Prior Content Knowledge and Pedagogical Knowledge

Given the educational impact of PCK, the state of research on PCK development in pre- and in-service teachers is unsatisfactory (Depaepe et al. 2013; Schneider and Plasman 2011; Seymour and Lehrer 2006). Although research has started to investigate the conditions for PCK development, the factors fostering teachers’ PCK construction remain obscure (Seymour and Lehrer 2006). In the literature, a major concern is the role of teachers’ prior CK and PK as individual resources for the development of PCK (Magnusson et al. 1999; Schneider and Plasman 2011; Van Driel et al. 1998).

Again it was Shulman who substantially influenced the fundamental ideas on the formation of PCK. He claimed that PCK represents the “blending of content and pedagogy” into an amalgam he called PCK (1987, p. 8). Consequently, CK and PK were considered important individual resources for PCK development (Grossman 1990; Krauss et al. 2008; Magnusson et al. 1999). However, what is more important: Is it the amalgamation of PK and CK that constitutes PCK construction? Is the formation of PCK mainly based on teachers’ CK resources? Or are there different routes or pathways to PCK development? These questions have broad implications for teacher education and professional development.

Amalgamation of CK and PK

Concerning PCK as an amalgam of content and pedagogy, a subtle differentiation has to be made: Is it a description of the process of PCK development, or is it a description of the properties of PCK? Ball, Thames and Phelps, for instance, displayed examples of mathematical knowledge of teaching and content, one of their components of PCK. They summarized that in all these examples, PCK “is an amalgam, involving a particular mathematical idea or procedure and familiarity with pedagogical principles for teaching that particular content” (Ball et al. 2008, p. 402). In this sense, the term amalgam refers to a property of PCK, and the authors do not infer that PCK necessarily needs to be developed from PK and CK. By contrast, in their review of science teacher PCK, Schneider and Plasman state that in order to develop PCK “science teachers need an understanding of science, general pedagogy, and the context (students and schools) in which they are teaching” (2011, p. 534). From these individual resources PCK is constructed in a process of “amalgamation or transformation” (Schneider and Plasman 2011, p. 533). This notion of amalgamation—that is, the process of constructing PCK from PK and CK as individual resources—is widespread (e.g., Krauss et al. 2008; Schneider and Plasman 2011). As Gess-Newsome points out, the assumption that PCK develops through an amalgamation of CK and PK is also reflected in traditional patterns of pre-service teacher education, with spatial and temporal separation of subject matter and pedagogical issues (Gess-Newsome 1999).

CK as the Main Resource

In the literature on teacher knowledge, there is some agreement that CK represents a main resource for PCK development (e.g., Depaepe et al. 2013; Friedrichsen et al. 2009; Krauss et al. 2008; Sadler et al. 2013). This assumption is often justified with the claim that it is CK that needs to be transformed into PCK (Shulman 1987). Further, this assumption is based on the observation that pre- and in-service teachers fail to develop proper PCK when CK is missing or deficient. Several qualitative studies have found that CK constrains the scope for PCK construction: Pre- and in-service teachers themselves often have misconceptions or fragmented content knowledge that limit, for example, their knowledge of student conceptions, or their knowledge of cognitively challenging learning situations (e.g., Friedrichsen et al. 2009; Van Driel et al. 1998).

Quantitative research also provides supportive evidence for the assumption that CK is a necessary or facilitating condition for PCK development (Hill et al. 2004; Krauss et al. 2008; Sadler et al. 2013). For instance, factor analyses of CK and PCK measures show that both constructs represent unique dimensions, but are often highly correlated (Krauss et al. 2008; Hill et al. 2004). In contrast, PK seems to be more loosely associated with PCK (Voss et al. 2011). Sadler and colleagues inspected constellations in teachers’ levels of CK and PCK. In their sample of 181 secondary physics teachers, they found teachers with high CK and PCK, teachers with high CK and low PCK, but almost no teachers showing high PCK levels and low levels of CK. They inferred that CK must be a necessary condition of PCK development (Sadler et al. 2013).

CK may even be considered a sufficient condition of PCK development. On the one hand, there is some evidence that higher levels of CK are not necessarily linked with higher levels of PCK (Lee et al. 2007; Sadler et al. 2013; Schneider and Plasman 2011). However, on the other hand, studies with German mathematics teachers showed that teachers teaching at academic track schools (Gymnasien) exhibited consistently higher levels of PCK than teachers from nonacademic track schools (Baumert et al. 2010; Kleickmann et al. 2013). This result is contrary to expectations, as teachers from academic track schools received broader and deeper learning opportunities for CK, but less for PCK. Profound CK may therefore even represent a sufficient condition for PCK development. A good example of how this assumption is reflected in education is the university teaching system: University professors and lecturers are usually appointed on account of their presumed knowledge in their field of study, and it is assumed that they will be able to teach these topics thanks to their CK; explicit instruction in PCK is not deemed necessary. Another example is the practice of lateral entry into teaching. Lateral entry allows content specialists to obtain a teaching position in schools without previous participation in a teacher education program.

Multiple Pathways

Some authors have assumed that there might be multiple pathways or routes to teachers’ PCK development (Gess-Newsome 1999; Magnusson et al. 1999; Schneider and Plasman 2011). Gess-Newsome, for instance, has suggested that teachers’ PCK construction may primarily be based on or facilitated by teachers’ CK resources, but, when CK is deficient, teachers may rely on their PK (1999). This assumption was also proposed by Krauss et al. (2008). In a sample of biology and chemistry teachers, they found low levels of mathematical CK, but comparably high levels of PCK. The authors suggested that these teachers may have drawn on their general PK when constructing PCK. However, this notion is challenged by results from a quasi-experimental field study by Strawhecker (2005). She found that a method course for pre-service mathematics teachers addressing general PK did not substantially contribute to PCK development (Strawhecker 2005).

1.3 The Present Study

The present study was concerned with the role of prior PK and CK as individual resources for the development of teachers’ PCK. In teacher education, the balancing of learning opportunities for CK, PK and PCK is a matter of great concern (Gess-Newsome 1999; Strawhecker 2005). Providing evidence on the role of prior CK and PK for the development of PCK is therefore an important issue for educational research, as it may inform this debate.

In previous research on the role of CK and PK for PCK development, three main assumptions may be differentiated: (1) teachers construct PCK from their prior CK and PK in a process of amalgamation, (2) CK is a necessary condition and facilitates PCK development, and (3) CK is sufficient for teachers’ PCK development. Finally, some authors suggest that there might be multiple pathways to PCK development. Up to now, these assumptions have mainly been based on case studies, with some of them including longitudinal designs and/or cross-sectional field studies. Quasi-experimental studies are rare and, as far as we know, no experimental studies have yet been conducted. Thus, causal inferences on the validity of the aforementioned assumptions are not yet warranted.

In the present study, we aimed to complement existing research by a randomized controlled trial on the role of prior CK and PK for pre-service mathematics teachers’ PCK in the domain of fractions and computations with fractions. We thereby aimed at providing causal evidence on the validity of the aforementioned assumptions concerning the development of PCK. To this end, we experimentally manipulated pre-service teachers’ CK, PK, and PCK and inspected effects on their PCK development. We chose the domain of fractions as it is well researched with regard to student conceptions and instructional strategies fostering student understanding.

The focus of this chapter is (1) to describe the treatments implemented to experimentally manipulate participants’ professional knowledge, (2) to introduce our measures of PCK, CK, and PK, and (3) to present findings on the quality of our measures, as well as to provide a summary of preliminary results of tests of the three aforementioned assumptions.

2 Methods

Participants attended intensive two-day workshops featuring various combinations of lessons on CK, PCK and PK that are potentially relevant for teaching fractions and fractional arithmetic in sixth-grade mathematics. The experimental design featured three experimental and two control groups. Each experimental group was devised to represent one hypothesis about the development of PCK. The experimental group representing the amalgamation hypothesis received lessons on CK on the first day and lessons on PK on the second day (EG amalg). The experimental group representing the hypothesis that CK is a necessary condition and facilitates PCK development received lessons on CK on the first day and lessons on PCK on the second day (EG facil). The experimental group representing the hypothesis that CK is sufficient for the development of PCK received lessons on CK on both days (EG suffi). The control groups were further divided into a weak and a strong control group; participants in the weak control group received only instruction on PK (CG weak), while participants in the strong control group received only instruction on PCK (CG strong).
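The group-by-day structure of the design can be summarized compactly. The following is a minimal Python sketch and not part of the study materials; the names `DESIGN` and `groups_receiving` are hypothetical, and the distinction between basic and advanced courses is left out.

```python
# Illustrative encoding of the five-group design described above.
# Keys are the group labels used in the text; values are the knowledge
# domains taught on day 1 and day 2. All identifiers are hypothetical.
DESIGN = {
    "EG amalg": ("CK", "PK"),     # amalgamation hypothesis
    "EG facil": ("CK", "PCK"),    # CK as necessary and facilitating condition
    "EG suffi": ("CK", "CK"),     # CK as sufficient condition
    "CG weak": ("PK", "PK"),      # weak control group: PK only
    "CG strong": ("PCK", "PCK"),  # strong control group: PCK only
}


def groups_receiving(domain):
    """Return the groups that are taught `domain` on at least one day."""
    return sorted(g for g, days in DESIGN.items() if domain in days)


if __name__ == "__main__":
    for domain in ("CK", "PK", "PCK"):
        print(domain, groups_receiving(domain))
```

Read this way, only EG facil and CG strong receive explicit PCK instruction, so PCK gains in the remaining groups would have to stem from CK, PK, or their combination.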

The experimental design contained four measurement occasions: a pretest at the beginning of the first workshop day, an intermediate test at the beginning of the second day, a posttest at the end of the second day and a follow-up test approximately 6 weeks after the workshops. The current chapter reports on the first three measurement occasions (see Fig. 8.1).

Fig. 8.1 Experimental design with groups, tests, and measurement occasions. Additional covariates not included

Relations between PCK, CK, and PK depend to a great extent on the definitions of these constructs. Knowledge of classroom management, for instance, which is often included in PK definitions, should be more distal to PCK than general PK on student conceptions and conceptual change. In our study, we tried to closely attune tests and treatments on PCK, CK, and PK.

2.1 Participants

One hundred pre-service teachers enrolled in undergraduate programs that prepared them for teaching at both the elementary and the lower secondary levels participated in the study. Twelve participants were male. Participants’ ages ranged from 19 to 46 years; most participants were in their early twenties (M = 22.9 years, SD = 5.0). Ninety-five percent were in the first year of their academic studies. Participants received a payment of 200 Euro. This payment was reduced to 160 Euro if participants missed the follow-up assessment. We recruited participants from universities in Potsdam and Berlin. From the pool of 165 applicants seeking to participate in our study, we randomly assigned persons to each of the five groups of our experimental design. This procedure resulted in moderately unequal group sizes, ranging from 16 participants for CG weak to 23 participants for EG suffi.
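The exact assignment procedure is not described in more detail. One procedure that typically yields unequal group sizes is simple randomization, in which each person is assigned independently of all others. The sketch below illustrates this under that assumption; the function name `assign_randomly`, the integer participant ids, and the seed are hypothetical.

```python
import random
from collections import Counter

# Group labels as used in the text.
GROUPS = ["EG amalg", "EG facil", "EG suffi", "CG weak", "CG strong"]


def assign_randomly(participant_ids, groups=GROUPS, seed=None):
    """Assign each participant independently to one group.

    This is simple randomization (an assumption, not the documented
    procedure of the study); group sizes are therefore not forced
    to be equal.
    """
    rng = random.Random(seed)
    return {pid: rng.choice(groups) for pid in participant_ids}


if __name__ == "__main__":
    assignment = assign_randomly(range(100), seed=1)
    # Group sizes vary from run to run unless a seed is fixed.
    print(Counter(assignment.values()))
```

Blocked or stratified randomization would instead force near-equal group sizes, at the cost of a slightly more involved procedure.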

2.2 Treatments

The two-day workshops followed a common time schedule: Each day began with a testing session (120 min on the first and 60 min on the second day), followed by a half-hour break. After this, two 105-min instruction blocks followed, divided by a one-hour lunch break. The end of the second day additionally included a half-hour break and another testing session (90 min). In sum, the two-day workshops included seven hours of treatment in the respective domains, equaling four to five regular seminar sessions of 90 min. The treatments were conducted by an experienced lecturer in elementary mathematics education who, when teaching the courses, was unaware of the precise content of the tests on PCK, CK, and PK.
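The stated total of seven treatment hours follows directly from the schedule. The short Python check below uses only the durations given above; the variable names are chosen for illustration.

```python
BLOCK_MINUTES = 105    # length of one instruction block (from the text)
BLOCKS_PER_DAY = 2     # two instruction blocks per workshop day
DAYS = 2               # two-day workshop
SEMINAR_MINUTES = 90   # length of a regular seminar session (from the text)

treatment_minutes = BLOCK_MINUTES * BLOCKS_PER_DAY * DAYS
treatment_hours = treatment_minutes / 60
seminar_equivalents = treatment_minutes / SEMINAR_MINUTES

print(treatment_minutes)               # 420
print(treatment_hours)                 # 7.0
print(round(seminar_equivalents, 2))   # 4.67
```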

The implementation of the treatments followed an instructional storyline provided by lesson plans and presentation slides. Participants were equipped with corresponding handouts. Naturally, we aimed for a constant level of participant activity and involvement across treatments. Thus, treatments were interspersed with various tasks for participants, ranging from short questions to role play. Treatment blocks concluded with writing assignments prompting participants to recapitulate the major contents of the respective treatment blocks. When participants asked for information not intended by the treatment at hand (for instance, when participants during a treatment on CK asked for information on PCK), these questions were left unanswered, with a cursory reference to the rationale of the study. However, after the follow-up test, all participants were provided with the complete course material. Preliminary versions of the treatments had been piloted with a total of 100 pre-service teachers.

Both the treatment on PK and the treatment on CK overlapped in specific ways with the treatment on PCK, while they had no overlap with each other. For instance, the treatment on PK generically covered the hierarchy of enactive, iconic and symbolic representations. In contrast, the treatment on PCK introduced instructional representations for specific aspects of the area of fractions and fractional arithmetic, such as enactive and iconic representations for expanding and reducing fractions. The treatment on CK, finally, covered the topic of expanding and reducing fractions without reference to instructional representations.

In the experimental design, three of the five groups (EG suffi, CG weak and CG strong) featured repeated instruction in the same area of professional knowledge on both days of the workshops. For these groups, we devised a basic and an advanced course for each area of professional knowledge. Beyond repetition of some contents, advanced courses added further perspectives to basic courses, without extending the scope delimited by previous basic courses.

Treatment on Content Knowledge

The basic course on CK started with conveying very simple facts, such as clarification of the terms numerator, vinculum and denominator. After that, the set of positive rational numbers was constructed from the set of natural numbers as equivalence classes of simple linear equations (a = b ⋅ x, a, b ∈ N, b ≠ 0). In this context, a fraction corresponded with the desirable solution of an equation that has no solution in the set of natural numbers. Accordingly, a “new” set of numbers was constructed that is closed under division. Moreover, the equivalence of fractions representing the same rational number was highlighted. The procedures of expanding and reducing were introduced as techniques for converting equivalent fractions into each other. This concluded the first block of the basic course. The second block of the basic course was reserved for defining and exercising arithmetic operations with fractions. This included addition, multiplication and division. Participants examined these operations with respect to the definition of fractions by linear equations. The aspect of closure was discussed in this context. Moreover, participants practiced the ordering of fractions. Here, participants discovered the density of rational numbers.
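The construction of fractions from linear equations can be made concrete in a few lines. The following Python sketch is an illustration, not course material: it treats the pair (a, b) as the equation a = b ⋅ x, tests whether two such equations have the same solution, and expands and reduces fractions. The function names are hypothetical.

```python
from fractions import Fraction
from math import gcd


def equivalent(a, b, c, d):
    """True if a = b*x and c = d*x have the same solution x (b, d != 0)."""
    # Cross-multiplication avoids division and stays within the integers.
    return a * d == c * b


def expand(a, b, k):
    """Expand the fraction a/b by the natural number k."""
    return a * k, b * k


def reduce_fraction(a, b):
    """Reduce a/b to lowest terms by dividing out the greatest common divisor."""
    g = gcd(a, b)
    return a // g, b // g


if __name__ == "__main__":
    print(equivalent(2, 3, 4, 6))    # True: 2/3 and 4/6 name the same number
    print(expand(2, 3, 5))           # (10, 15)
    print(reduce_fraction(10, 15))   # (2, 3)
    # Density: the mean of two distinct rationals lies strictly between them.
    p, q = Fraction(1, 3), Fraction(1, 2)
    print(p < (p + q) / 2 < q)       # True
```

Expanding and reducing never leave the equivalence class: `equivalent(a, b, *expand(a, b, k))` holds for every natural number k.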

The advanced course was mostly a straightforward repetition of the basic course. In particular, the first block included constructing the set of positive rational numbers from the set of natural numbers, differentiating fractions and rational numbers, expanding and reducing fractions, as well as discussing the density and the cardinality of the set of positive rational numbers. Apart from repetition, the second block featured demonstrations of the validity of the commutative, distributive and associative laws for the set of rational numbers.

Treatment on Pedagogical Knowledge

At the beginning of the basic course on PK, participants were introduced to the conception of classroom instruction as the provision of opportunities to learn; the teacher was presented as an influential orchestrator of these opportunities. Apart from that, the first block of the basic course covered general principles of learning. Participants were familiarized with the central role of student conceptions and learned about the idea of learning as conceptual change. The second block of the basic course was concerned with generic principles of teaching. Specifically, this covered tolerance for errors, the use of misunderstandings for learning, the provision of adequate scaffolding and the use of representations for fostering understanding.

The advanced course mirrored the arrangement of the basic course; the first block focused on learning, the second block on teaching. Beyond repetition, the first block expanded participants’ capabilities with respect to the diagnosis of student conceptions, for example. Similarly, the second block concentrated on structuring content and reducing complexity of content as vehicles for facilitating understanding within a repetition of the basic principles of teaching.

Treatment on Pedagogical Content Knowledge

The basic course on PCK began with a general introduction to the relevance of student conceptions and conceptual understanding for teaching mathematics. The following first block of the basic course was concerned primarily with conceptual aspects of fractions. For instance, participants were introduced to the part-whole and the operator concepts of fractions; they discussed advantages and disadvantages of these concepts with regard to several aspects of teaching fractions in elementary school. Furthermore, participants were provided with methods for explicating the density of rational numbers and the fact that a rational number can be represented by varying fractions. The second block covered the topic of teaching operations with fractions. Specifically, participants learned about strategies elementary school students might use for comparing fractions, and how to foster the flexible use of these strategies. Moreover, the second block presented information on typical errors with respect to the addition and division of fractions; participants were instructed how to introduce these operations to elementary school students, for instance, by the use of appropriate representations. The second block concluded with a discussion of the fundamental changes student conceptions have to undergo in the transition from the set of natural numbers to the realm of fractions.

The advanced course on PCK started with a repetition of the fundamental changes in student conceptions that the introduction of fractions requires. The rest of the first block addressed teaching operations with fractions. This included multiplication and division as well as comparing fractions; participants were confronted with typical errors committed in elementary school and with different approaches for introducing these operations with fractions into the elementary school classroom. The second block covered representations. This included enactive, iconic and symbolic representations for expanding and reducing fractions, for addition with fractions and for multiplication with fractions.

2.3 Measures

Test of Content Knowledge

Measurement of participants’ CK was based on an item pool of 27 items. For economy of assessment, the item design was incomplete. Participants completed 20, 19 and 24 items on pretest, intermediate test and posttest, respectively. A set of 11 anchor items appeared in all three assessments; 15 items were utilized on two measurement occasions, while one item was presented exclusively on a single measurement occasion. The item pool comprised 6 closed-response and 21 free-response items. The CK item pool covered the correspondence of fractions and linear equations, the conversion of fractions into decimals (and vice versa), the ordering and comparison of fractions, calculations with fractions (including word problems), and specific properties of the set of rational numbers (see Fig. 8.2 for a sample item).

Fig. 8.2 Sample items from the tests on PCK, CK, and PK

Test of Pedagogical Knowledge

Assessment of participants’ PK was based on a pool of 40 items. As with the other measures of participants’ knowledge, items were partially rotated across measurement occasions. Specifically, participants worked on 29, 27 and 34 items on pretest, intermediate test and posttest, respectively. There were 16 anchor items appearing in all three assessments. Of the other items, 16 items were presented twice and eight items were presented once. The item pool was divided into 11 closed-response and 29 free-response items.

The PK item pool covered the relevance of student conceptions and prior knowledge for subsequent learning, the basic principles of conceptual change, the handling of errors, the role of representations, scaffolding and various methods for fostering understanding. Naturally, this categorization of items was only tentative. It was possible, for instance, to solve some items on scaffolding with knowledge about representations (see Fig. 8.2 for a sample item).

Test of Pedagogical Content Knowledge

Assessment of participants’ PCK was based on a pool of 41 items. In part, items were rotated across measurement occasions. On pretest, intermediate test, posttest and follow-up test, participants completed 36, 29, 38 and 41 items, respectively. A set of 23 anchor items was used on all measurement occasions, whereas 17 items were presented twice. One item was presented exclusively on the follow-up test. While 24 items had a closed response format, 17 items called for free responses.

The PCK item pool covered the use of enactive and iconic representations for facilitating understanding of fractions and operations with fractions, knowledge of typical errors and command of approaches for introducing the operations into the elementary school classroom, and knowledge about students’ conceptual understanding of fractions. Items regularly touched on several of these characteristics simultaneously, so the classification presented here is only tentative (see Fig. 8.2 for a sample item).

2.4 Baseline Equivalence and Treatment Implementation Checks

We checked whether our random assignment procedure resulted in baseline equivalence of the three experimental and two control groups with regard to their professional knowledge and with regard to covariates, such as motivational characteristics, epistemological beliefs, and beliefs on teaching mathematics. We found only minor and statistically nonsignificant group differences in the PCK, CK, and PK pretest scores, as well as in the covariates, indicating that randomization was successful.

We further checked whether our PCK, CK, and PK courses succeeded in manipulating participants’ professional knowledge as intended. An inspection of PCK, CK, and PK growth for each treatment day and each of the groups featured in our design showed the desired significant gains in participants’ professional knowledge. Moreover, we videotaped all courses in order to check whether only the intended knowledge domain was taught; these analyses are not yet completed.

3 Results

In this section, we present findings on the quality of our measures of PCK, CK, and PK, and then give a short summary of preliminary results on the tests of the three assumptions on PCK formation.

3.1 Measurement of Pre-service Teachers’ Knowledge

Test of Content Knowledge

For descriptive purposes (that is, to map item content on person ability), we submitted pre-service teachers’ responses on the test of CK to a concurrent calibration of pretest, intermediate test, and posttest, according to the simple Rasch model. Item difficulties ranged from −3.26 logits to 2.60 logits. The easiest item called for the subtraction of a proper fraction from another proper fraction, whereas the most difficult item required the production of all fractions with a denominator of three between a given proper fraction and a given mixed numeral. On average, items aiming at calculations with fractions were comparatively easy (M = −0.86 logits) though, with a range from −3.26 logits to 0.83 logits, they varied considerably in difficulty (SD = 1.72 logits). Relative to this, items covering the conversion of fractions into decimals (M = −0.34 logits, SD = 0.38 logits) and items requiring the comparison or ordering of fractions (M = −0.34 logits, SD = 1.43 logits) exhibited intermediate average item difficulties. Finally, items involving the expression of fractions as classes of equivalent equations (M = 0.66 logits, SD = 0.32 logits) and items asking for general properties of the set of rational numbers (M = 1.21 logits, SD = 0.49 logits) possessed the highest average difficulties of all items of the test of content knowledge. In addition, items featuring improper fractions or mixed numerals (M = 0.01 logits, SD = 1.63 logits) exceeded items presenting exclusively proper fractions (M = −0.79 logits, SD = 1.14 logits) in terms of average difficulty. Infit values varied between 0.83 and 1.26, indicating reasonable fit to the simple Rasch model.
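To read the logit values, recall that under the simple Rasch model the probability of a correct response depends only on the difference between person ability and item difficulty. The sketch below evaluates this relation for the easiest and the most difficult CK item reported above. The person abilities of −1, 0 and 1 logits are hypothetical illustration values, and `rasch_probability` is a name chosen here, not a function of any calibration software.

```python
import math


def rasch_probability(ability, difficulty):
    """P(correct response) under the simple Rasch model (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


if __name__ == "__main__":
    easiest, hardest = -3.26, 2.60    # CK item difficulties from the text
    for theta in (-1.0, 0.0, 1.0):    # hypothetical person abilities
        print(theta,
              round(rasch_probability(theta, easiest), 2),
              round(rasch_probability(theta, hardest), 2))
```

A person whose ability equals the item difficulty solves the item with a probability of 0.5, which is why mean abilities can be mapped onto item content.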

For model identification, the distribution of item difficulties possessed a predefined mean of 0.00 logits (SD = 1.33). In comparison, the mean of the distribution of person ability for pretest equaled −0.65 logits (SD = 1.11). On the intermediate test, mean person ability was 0.19 logits (SD = 1.38). Finally, on the posttest, the mean person ability equaled 0.50 logits (SD = 1.27). In other words, on average, participants started the workshops with the ability to solve simple calculations with fractions, mastered the conversion of fractions into decimals, as well as the comparison and ordering of fractions in the intermediate test, and approached the ability to handle fractions in the form of equations at posttest. Cronbach’s alphas were .80, .80 and .84 for pretest, intermediate test and posttest, respectively. The person separation reliability for the weighted likelihood estimates of ability obtained from the concurrent calibration of the three measurement occasions was .82.
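For readers less familiar with the internal consistency coefficient reported here, the following sketch computes Cronbach's alpha from a persons-by-items score matrix using the standard formula. The response data are made up for illustration and do not come from the study, and `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a persons-by-items matrix (equal-length rows)."""
    k = len(scores[0])                   # number of items
    items = list(zip(*scores))           # transpose to items-by-persons
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


if __name__ == "__main__":
    # Made-up dichotomous responses (4 persons x 3 items), illustration only.
    toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
    print(round(cronbach_alpha(toy), 2))   # 0.75
```

Because the item design was incomplete, such a coefficient would be computed separately for each measurement occasion on the items administered at that occasion.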

Test of Pedagogical Knowledge

A calibration following the simple Rasch model was performed on pre-service teachers’ responses on the test of PK. A range of item difficulties from −5.06 logits to 3.23 logits was obtained. The easiest item asked participants to recognize mistakes in the classroom as opportunities to learn. The most difficult item asked for brief definitions of the notions of enactive, iconic and symbolic representations. The average difficulty of items focusing on learning (M = −0.06 logits, SD = 1.31 logits) did not differ considerably from the average difficulty of items centering on teaching (M = 0.07 logits, SD = 1.72 logits). Specifically, items concerned with the proper handling of mistakes in classroom instruction constituted a relatively easy set of items with a remarkable variation in difficulty (M = −1.58 logits, SD = 3.12 logits). In comparison, knowledge about the importance of student conceptions and prior knowledge for successful learning represented a more advanced step in proficiency on the test of PK (M = −0.56 logits, SD = 1.32 logits). Items probing participants’ capabilities with respect to the concept of scaffolding denoted even further advanced proficiency (M = 0.31 logits, SD = 1.09 logits). Finally, on average, command of the basic principles of conceptual change theory (M = 0.65 logits, SD = 0.96 logits) and of the notions of enactive, iconic and symbolic representations (M = 0.69 logits, SD = 1.70 logits) constituted the apex of proficiency in PK. Infit values ranged from 0.81 to 1.14, reflecting adequate fit to the simple Rasch model.

The distribution of item difficulties of the test of PK had a predefined mean of 0.00 logits (SD = 1.49). Relative to this, the mean of the ability distribution on the pretest was −1.59 logits (SD = 0.66). On the intermediate test, average person ability equaled −1.61 logits (SD = 0.78). Finally, on the posttest, the mean of the ability distribution amounted to −0.85 logits (SD = 0.85). In essence, the test of PK was, on average, very difficult. Most participants mastered merely the easiest items of the test. In fact, only in eight cases did item difficulty fall below average person ability on the posttest. Internal consistency, in terms of Cronbach’s alpha, was .48, .68, and .78 for pretest, intermediate test, and posttest, respectively. The person separation reliability of the weighted likelihood estimates of ability was .67.

Test of Pedagogical Content Knowledge

Calibration according to the simple Rasch model, based on pre-service teachers’ responses on the test of PCK for the first three measurement occasions, yielded item difficulties that varied between −3.53 logits and 2.33 logits. The easiest item was concerned with the shortcomings of introducing fractions initially via equations. The most difficult item, in contrast, asked participants to provide an intuitively accessible explanation for the use of the reciprocal of a fraction in division involving fractions. On average, items aiming for knowledge about elementary school students’ conceptual understanding of fractions per se were comparatively easy to solve (M = −0.34 logits, SD = 1.48 logits). Items probing for participants’ proficiency with regard to the use of representations for fostering understanding were somewhat more difficult (M = −0.09 logits, SD = 1.31 logits). Finally, items centering on the teaching of operations constituted the set of items with the highest average difficulty (M = 0.42 logits, SD = 1.07 logits). However, disparities in mean difficulty between the three tentative groups of items tended to be moderate. Infit values varied between 0.89 and 1.09, indicating excellent fit to the simple Rasch model.

The distribution of item difficulties for the test of PCK was predefined with a mean of 0.00 logits (SD = 1.30). On the pretest, the mean of the ability distribution was −0.31 logits (SD = 0.61). On the intermediate test, the average person ability was −0.01 logits (SD = 0.68). Finally, on the posttest, the mean of the ability distribution equaled 0.31 logits (SD = 0.73). This indicates a steady increase in participants’ average ability with regard to pedagogical content knowledge across the three measurement occasions, without floor or ceiling effects. Cronbach’s alphas amounted to .61, .60, and .72 for pretest, intermediate test, and posttest, respectively. The weighted likelihood estimates of person ability displayed a separation reliability of .68.
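Cronbach’s alpha, reported above for each test and measurement occasion, can be computed from a persons-by-items score matrix. The following is a minimal sketch, assuming dichotomously scored items and complete data; the function name `cronbach_alpha` and the toy responses are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-person rows of item scores."""
    n_items = len(scores[0])
    # Transpose rows (persons) into columns (items).
    item_columns = list(zip(*scores))
    sum_item_variances = sum(variance(column) for column in item_columns)
    total_score_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1.0 - sum_item_variances / total_score_variance)


# Hypothetical responses: 5 persons x 4 items (1 = correct, 0 = incorrect).
toy_scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

print(f"alpha = {cronbach_alpha(toy_scores):.2f}")
```

Alpha grows as the items covary more strongly relative to their individual variances, which is one way to read the increase from pretest to posttest reported for the tests of PK and PCK.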

Exploration of Dimensionality and External Validity

To assess the dimensionality of professional knowledge captured with our instruments, we submitted participants’ responses on the three tests to one unidimensional, two two-dimensional, and one three-dimensional calibration according to the simple Rasch model; in each case the measurement occasions of pretest, intermediate test, and posttest were calibrated concurrently. In each of the two-dimensional calibrations, two domains of professional knowledge with partially overlapping content formed a single dimension: that is, CK and PCK, or PK and PCK, were combined. Subsequent likelihood ratio tests showed that the three-dimensional model had better relative model fit than the unidimensional model, χ²(5) = 746.98, p < .001, the two-dimensional model combining CK and PCK, χ²(3) = 285.94, p < .001, and the two-dimensional model combining PK and PCK, χ²(3) = 281.29, p < .001. Latent correlations retrieved from the three-dimensional calibration between the tests of CK and PCK, between the tests of CK and PK, and between the tests of PK and PCK amounted to .61, .05, and .25, respectively. In sum, it appears justified to view the three tests as assessments of distinct dimensions of professional knowledge.
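The p-values of these likelihood ratio tests follow from the upper tail of the chi-square distribution. The sketch below uses a closed-form expression for integer degrees of freedom and only the Python standard library; `chi2_sf` is a hypothetical helper written for this illustration, not the routine used in the original analyses. The test statistics and degrees of freedom are the ones reported above.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series in half-integer powers of x.
    term, total = math.sqrt(x), 0.0
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


# Comparisons of the three-dimensional model against each restricted model.
REPORTED_TESTS = {
    "unidimensional": (746.98, 5),
    "two-dimensional (CK+PCK)": (285.94, 3),
    "two-dimensional (PK+PCK)": (281.29, 3),
}

for model, (statistic, df) in REPORTED_TESTS.items():
    print(f"3D vs. {model}: chi2({df}) = {statistic}, p < .001: {chi2_sf(statistic, df) < 0.001}")
```

The degrees of freedom equal the number of additional variance and covariance parameters in the three-dimensional model: six against one for the unidimensional model, and six against three for each two-dimensional model.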

To explore the external validity of the tests of PCK, CK, and PK, we investigated correlations with participants’ motivational characteristics, epistemological beliefs, and beliefs about teaching. As expected, the test of CK was significantly related to interest in math, math self-concept, and the epistemological belief of math as a process. PCK was also significantly related to these math-specific measures, but to a smaller degree. However, it correlated more strongly than CK with a transmission belief about teaching math. PK was not significantly related to these math-specific measures.

3.2 Testing the Assumptions on PCK Development

In this section, we summarize the first findings of the experimental tests of the assumptions on PCK development. Please note that these are preliminary results that will need to be substantiated with more elaborate analyses (Troebst et al. in prep.). The control group, which exclusively received instruction on PK (CG weak), did not display significant PCK development on either the first or the second day. The EG amalg group, which participated in lessons on CK on the first day and lessons on PK on the second day, showed significantly larger PCK development than did CG weak. The EG suffi group, which was provided with lessons on CK on both days, also showed significantly larger PCK growth than did CG weak. EG facil, which featured lessons on CK on the first day and lessons on PCK on the second day, as well as CG strong, which participated in lessons on PCK on both days, demonstrated the largest PCK gains. Our design allowed further testing of the assumption that CK facilitates PCK development. Two groups received exactly the same lessons on PCK on one of the two treatment days, but differed in their prior CK: CG strong received the same lessons on PCK on their first day as did EG facil on their second day, after the latter group’s participation in CK lessons on the first day. In our present, preliminary analyses, both groups exhibited the same gains in PCK in the course of their lessons on PCK.
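The assignment of lesson content to groups and days described in this section can be laid out compactly as a lookup table. The sketch below encodes only the design as stated in the text; the names `DESIGN` and `groups_with` are hypothetical and serve the illustration.

```python
# Lesson content per treatment day (day 1, day 2), as described in the text.
DESIGN = {
    "CG weak": ("PK", "PK"),
    "EG amalg": ("CK", "PK"),
    "EG suffi": ("CK", "CK"),
    "EG facil": ("CK", "PCK"),
    "CG strong": ("PCK", "PCK"),
}


def groups_with(content, day):
    """Return the groups that received `content` lessons on the given day (1 or 2)."""
    return [group for group, days in DESIGN.items() if days[day - 1] == content]


# Facilitation contrast: the same PCK lessons without (day 1) or with (day 2)
# a preceding day of CK lessons.
print("PCK on day 1:", groups_with("PCK", 1))
print("PCK on day 2:", groups_with("PCK", 2))
```

Read this way, the facilitation contrast compares the first-day PCK gains of CG strong with the second-day PCK gains of EG facil.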

4 Discussion

In the debate about how best to prepare teachers, there is much speculation on the role of CK and PK in teacher education. However, this speculation is often not based on evidence. Our study is one of the first to address these questions in a randomized controlled trial (RCT). For the purpose of experimentally testing the aforementioned assumptions on PCK development, we designed courses for pre-service teachers that were aimed at manipulating teachers’ prior professional knowledge. Further, we constructed tests to assess participants’ PCK, CK, and PK. The courses and tests on CK and PK were closely attuned to those for PCK. Preliminary results indicate high internal validity of our RCT. Our randomized assignment of participants to treatments resulted in baseline equivalence in our three measures of teacher knowledge. Moreover, treatment implementation checks revealed that participants’ PCK, CK, and PK were manipulated through our courses as intended. Video-based analyses will allow us to further probe the intended implementation of our courses. As our courses on PCK, CK, and PK resembled those courses ordinarily implemented in university-based teacher education, we also consider the external validity to be high. Our block courses could quite readily have been part of regular teacher education programs.

Concerning the measurement of pre-service mathematics teachers’ knowledge, our analyses yielded the following results. In the pretest, the tests of PCK, CK, and particularly PK were comparatively difficult for the participating pre-service teachers. Concerning PCK, tasks on teaching strategies and representations facilitating student understanding of fractions appeared to be particularly difficult. With regard to CK, even some of the tasks on the computation of fractions were difficult for the pre-service teachers. However, all three tests proved to be sensitive to our treatments. In the posttest, participants had substantially higher probabilities of solving the items. With regard to PCK, our main dependent variable, we observed a steady increase in participants’ average ability across the three measurement occasions, without floor or ceiling effects.

Multidimensional Rasch analyses supported the three-dimensional structure of pre-service mathematics teachers’ professional knowledge. The three factors represented were PCK, CK, and PK. PCK and CK were more highly correlated than were PCK and PK, whereas CK and PK were the least correlated. These findings support the distinction between closely related subject matter knowledge, that is, CK and PCK, on the one hand, and general PK on the other hand (Ball et al. 2008; Shulman 1987). PK was substantially more weakly related to PCK than CK was, although our PK test only included knowledge of learning and teaching that was closely attuned to the PCK construct. For this purpose, the PK test also featured knowledge of student conceptions, conceptual change theories, and teaching strategies to overcome student misconceptions, albeit from a general perspective. Correlations with external variables such as interest in math and beliefs about the teaching of math provided evidence for the external validity of our measures of teacher knowledge.

Preliminary tests of the three assumptions about PCK development pointed to the following results. Our control group, which received lessons on general PK only (CG weak), did not develop any PCK. As often assumed in the literature, we found evidence, at least to a certain degree, of an amalgamation of CK and PK. Further, CK seemed to be sufficient for PCK development, again at least to a certain degree. However, two other routes to the development of PCK proved to be far more effective. The first route consists of explicitly addressing PCK: that is, knowledge of students, learning, and teaching in concrete content domains (CG strong). The second route features a combination of CK and PCK (EG facil). In all, these preliminary results indicate that there are different pathways to PCK development. The transformation of CK and/or PK does not seem to be the only route to PCK construction. In fact, explicitly addressing the knowledge of students, learning, and teaching in concrete content domains, whether with or without antecedent CK instruction, appeared to be the most effective pathway.

Evidence for the role of prior PK in the development of PCK appeared to be weak, although our measures and treatments of PK and PCK were closely attuned. The control group receiving only PK instruction (CG weak) did not show any growth in PCK, and in EG amalg we detected only comparatively weak effects on PCK development. Moreover, the overall amalgamation effect was partly due to PCK development from CK only. These results call into question the role of general PK in the development of PCK. However, beyond the target of PCK development, PK should be considered an important dimension of teacher knowledge: for instance, with regard to effective classroom management (Voss et al. 2014). Moreover, conditions that improve the transformation of PK into PCK, and the applicability of PK in classroom teaching, should be examined in future research. Our preliminary results point to advantages of teacher-specific over polyvalent or traditional teacher education. Fostering CK and PK separately, as realized in polyvalent or traditional teacher education (Gess-Newsome 1999), appeared to be the least effective approach in terms of PCK development. Other routes to PCK development, as realized in EG facil and CG strong, seem to be far more effective.

However, our study investigated the development of pre-service teachers’ PCK in just one content area. Our results therefore need to be replicated in other subjects and with other groups of teachers: for instance, with secondary school teachers and with in-service teachers. Future studies could also consider the role of teaching experience in the process of PCK development and include measures of teachers’ actions. In the present study, we embedded a lesson preparation task into the follow-up assessment. These data will be considered in a subsequent publication. Finally, whereas we inspected the effects of separate CK and PCK courses, an investigation of integrated CK and PCK instruction would also be worthwhile.