Introduction

There is accumulating evidence regarding the importance of early mathematical skills in later mathematics learning and achievement (Duncan et al. 2007). Researchers find that helping students build stronger foundations in the early years may lead to improved mathematics achievement in later years (Clements and Sarama 2009). Elementary Math Specialist programs have been identified as a promising approach to improving early childhood mathematics teaching and learning (Association of Mathematics Teacher Educators 2013; Campbell et al. 2013; Conference Board of Mathematical Sciences 2012; Reys and Fennell 2003). Primarily Math, a K-3 Math Specialist program, was designed to improve the quality of instruction and children’s mathematics achievement through addressing three critical issues: teachers’ mathematics knowledge for teaching, attitudes toward their own learning of mathematics, and beliefs about mathematics teaching and learning (Ginsburg et al. 2008).

Primarily Math (PM) consists of 18 credit hours of graduate-level mathematics and pedagogy courses that help teachers attend and respond to children’s mathematical thinking and reasoning. The courses aim to develop in teachers “…the habits of mind of a mathematical thinker” (Conference Board of the Mathematical Sciences 2012, p. 8), hoping that in turn the teachers will “develop flexible, interactive styles of teaching” (p. 8) that support comparable mathematical habits of mind in their students.

The goal of this study is to contribute evidence-based information on the effectiveness of PM by examining the impact of participation on teachers’ mathematical knowledge for teaching, their attitudes toward learning mathematics, and their beliefs about mathematics teaching and learning. These three teacher-level variables are directly targeted by PM and have all been linked to teachers’ choice of instructional practices (Wilkins 2008). Teachers may consolidate and incorporate what they have learned from professional development opportunities into aspects of their teaching, which can in turn affect students’ achievement. This consolidation has been found in research examining the impact of elementary mathematics coaches (Campbell and Malkus 2009) and Cognitively Guided Instruction, a program focused on student thinking and how it impacts instruction (Carpenter et al. 2000). Teachers are thought to transmit their influence on student learning via classroom practice (Adler et al. 2005; Fennema and Franke 1992). Thus, examining these proximal outcomes is the critical first step in linking professional development to changes in teaching practices and student achievement.

The targets of professional development

To teach young children effectively, teachers must strive to understand and interpret children’s mathematical thinking as well as teach mathematics skills and concepts in ways that are responsive to children’s developmental and learning needs (Clements and Sarama 2009; Ginsburg et al. 2008; Kilpatrick et al. 2001). Scholars and professional institutions generally agree that skillful teachers should have not only deep knowledge of the subject matter but also pedagogical knowledge, constructive beliefs about the learning and teaching of mathematics, and positive mathematical attitudes. Thus, in line with the propositions of the National Association for the Education of Young Children (NAEYC 2002) and the National Council of Teachers of Mathematics (NCTM 2010), the professional development efforts of PM focused on teachers’ knowledge as well as their attitudes and beliefs about teaching and learning.

Effective teachers need to have deep mathematical knowledge for teaching (Ma 1999). Teaching mathematics for conceptual understanding requires mathematical content knowledge specific to tasks of teaching: posing mathematical questions, giving and appraising explanations, using and choosing representations, analyzing student errors and appraising students’ unconventional ideas, mediating discussion, and using precise language (Thames and Ball 2010, p. 223). Various researchers offer theories to frame a discussion around the kinds of knowledge important to instruction, including the National Research Council’s (2001) five strands of mathematical proficiency and Niss’ (1999) mathematical competencies underlying the Programme for International Student Assessment (PISA). A current theory is mathematical knowledge for teaching (Ball 1993; Ball et al. 2008; Lampert 1990, 2003), which also underlies the Teacher Education Development Study in Mathematics (TEDS-M; Tatto et al. 2012) frameworks for research on prospective teachers.

Mathematical knowledge for teaching is proposed to be a specialized body of knowledge. Ball et al. (2008) specified two major domains—subject matter knowledge and pedagogical content knowledge—which are further specified into six parts, including common content knowledge, horizon content knowledge, specialized content knowledge, knowledge of content and students, knowledge of content and teaching, and knowledge of content and curriculum. There is emerging evidence, primarily from the efforts of Hill et al. (2007), that having more mathematical knowledge for teaching is associated with higher-quality mathematical instruction, supports teachers’ abilities to interpret and respond to students’ mathematical productions (Hill 2010), and is related to mathematical elements of classroom work and student mathematical outcomes (Baumert et al. 2010; Hill et al. 2005; Hill et al. 2008).

Teachers need to cultivate positive attitudes toward their own professional learning of mathematics that leverage their ability to respond to the mathematics emerging within their practice (Borko 2004; Guskey 2010; Richardson 1996). Neale (1969) defined attitude as “a liking or disliking of mathematics, a tendency to engage in or avoid mathematical activities, a belief one is good or bad at mathematics, and a belief mathematics is useful or useless” (p. 632). A large percentage of elementary teachers report mathematics anxiety (Hancock and Dawson 2001), negative histories with respect to mathematics learning experiences, and relatively low efficacy for teaching mathematics (Graven 2004; Grootenboer and Zevenbergen 2008; Hodgen and Askew 2007; Lerman 2012), and rank mathematics as their least favorite subject to teach (Wilkins 2010). Teachers’ attitudes toward learning mathematics influence their instructional practices, such as time allocation to mathematics instruction, the use of inquiry-based instruction, and the way teachers respond to student questions and problems (e.g., Beswick 2012; Novotná et al. 2014; Wilkins 2008). Indeed, educators must be cognizant of the influence they can have on students’ attitudes, as anxiety may be “contagious” (Beilock et al. 2010).

Finally, mathematics beliefs are “personal judgments about mathematics formulated from experiences in mathematics, including beliefs about the nature of mathematics, learning mathematics, and teaching mathematics” (Raymond 1997, p. 552). A body of literature documents how teachers’ prior and current beliefs about mathematics (Beswick 2012) and beliefs about teaching and learning (Zakaria and Maat 2012) are related to their teaching practices. Teachers of young children would be expected to endorse student-centered beliefs about mathematics teaching and learning balanced with strong understanding of the teacher’s role in guiding learning—an orientation common in early childhood teacher preparation and professional development programs. Teachers with student-centered beliefs understand the ways in which students construct their own knowledge through active investigation and meaningful exploration, and endorse “conceptualizations of mathematical learning and knowing emphasizing conceptual understanding, problem solving, reasoning, and sense-making” (Clark et al. 2014, p. 249). Mathematical beliefs play an important role in how teaching practices develop and evolve (Borko and Putnam 1996), teachers’ efforts to adopt reform-oriented practices (Wilkins 2008), and teachers’ uses of cognitive resources (Pajares 1992).

What does effective mathematics professional development for elementary teachers look like?

Although a wide range of activities are collectively labeled “professional development,” there is a consensus within the field of education regarding what constitutes high-quality professional development (see Elmore 2002; Putnam and Borko 1997; Wei et al. 2010; Wilson and Berne 1999). Borko’s (2004) theoretical article “maps the terrain” of research on the impact of professional development on mathematics teaching and learning. Three program features characterize “high-quality” programs, with activities that: enhance mathematical content knowledge, while situating teacher learning within classroom practice; help teachers to access and make students’ ideas and reasoning visible, in order to guide student thinking in mathematically productive and generative directions; and leverage teacher capacity for high-quality instruction.

Several features of professional development are important for influencing changes in teachers and teaching. Programs that focus on subject matter, such as Young Mathematicians at Work (Schifter and Fosnot 1993) and Cognitively Guided Instruction (Carpenter et al. 1989, 1996, 2000), strengthened teacher knowledge and classroom practice. Walker (2007) also provided descriptive, qualitative evidence of how the Dynamic Pedagogy Project was useful in helping teachers to develop meaningful lesson plans, select and design rich mathematical tasks, and reflect on their pedagogical decisions after implementing lessons.

In contrast, evaluations of other professional development efforts have reported no significant impact on instruction (e.g., Piasta et al. 2015). In fact, the Coalition for Evidence-Based Policy (2013) reports that 88% of 90 highly regarded studies funded by the Institute of Education Sciences since 2002 produced weak or null results, even in the presence of a widely implemented program with a well-designed evaluation. The causes of null results are multifold, pertaining to the intervention itself or the organizational structure that can support or undermine its effectiveness (Hill et al. 2016). It is instructive to examine the programs producing significant results on teacher or student variables in order to increase knowledge about which program inputs are associated with positive outcomes.

Cohen and Ball (1999) describe traditional in-service professional development as fragmented seminars that tend to underestimate what it would take to implement activities that would translate into instructional change. Effective professional development can be characterized by structural features such as spanning longer periods of time (e.g., >14 h; Yoon et al. 2007), having more resources to compensate and support teachers within the classroom, and attracting more motivated participants. Heck et al. (2008) studied 48 projects in the National Science Foundation’s (NSF) Local Systemic Change through Teacher Enhancement Initiative. Consistent with research and professional recommendations, these NSF programs focused on content knowledge intentionally linked to classroom practice as well as other proximal influences on instruction and pedagogical decision making through opportunities for hands-on practice and reflection for an extended period of time (Garet et al. 2001; Hill 2009; Wayne et al. 2008).

According to Heck et al. (2008), K-8 teachers with more hours of professional development reported more positive attitudes toward Standards-based instructional practices, reported feeling more prepared to teach for conceptual understanding, and reported feeling more prepared to teach topics commonly covered across the K-8 program of study. Correspondingly, more hours of professional development were also associated with reports of more hands-on, investigative instructional practices (i.e., a student-centered, teacher-facilitated approach). Heck and colleagues conclude that professional development can positively influence teachers’ attitudes, perceptions of preparedness to teach, and their practice. Perhaps, professional development that is designed to address all of these goals in concert is more likely to have the desired effects on classroom instruction.

How knowledge, attitudes, and beliefs interact to influence instruction

PM posits a holistic approach to professional development, in which a teacher needs both cognitive (e.g., subject knowledge, pedagogy) and affective (e.g., confidence) resources to be able to teach mathematics effectively. This view corresponds with Ernest’s (1989) model of mathematics teaching, which delineates three teacher characteristics that influence teacher thinking and action from one moment to the next: knowledge, beliefs, and attitudes. These three components are not separate entities functioning independently in influencing classroom practices; rather there is interdependency among the components (Campbell et al. 2014; Holm and Kajander 2012; Ma 1999; Wilkins 2008). Researchers in mathematics education have used various definitions of knowledge, beliefs, and attitudes and do not always offer clear distinctions between these constructs—further testimony with respect to the interconnectedness among these components. McLeod (1992) and Philipp (2007) offer some working definitions of these terms, which we do not repeat here, and instead focus on the relationships among mathematical knowledge, beliefs, and attitudes.

Several studies have examined interconnections between teachers’ mathematical beliefs and content knowledge (e.g., Campbell et al. 2014; Holm and Kajander 2012; Wilkins 2008). Teachers’ mathematical content knowledge has been found to influence classroom practices in positive ways (Fennema and Franke 1992; Lloyd and Wilson 1998); however, teachers with strong content knowledge do not necessarily teach mathematics in a way that facilitates mathematical understanding for all children (Mewborn 2001). Ball (1991) suggests teachers with similar kinds and levels of mathematical knowledge may teach very differently: teachers’ beliefs about the nature of mathematics and the teaching and learning of mathematics affect what mathematics is taught, as well as how teachers promote such learning. Teachers’ mathematical beliefs influence the ways in which teachers translate their subject knowledge into instruction.

In addition, beliefs are also influenced by teacher knowledge: “beliefs may be dependent on the existence or, perhaps, the absence of knowledge” (Cooney and Wilson 1993, p. 150). Further, teachers’ lack of knowledge can impede their abilities to understand students’ mathematical thinking and address students’ misconceptions—even if the teachers strongly embrace student-centered beliefs about mathematics learning and teaching (McDuffie 2004). There is a general consensus that beliefs and content knowledge need to be addressed simultaneously to support teachers’ efforts to facilitate student inquiry.

Teachers’ mathematical attitudes also interact with their knowledge and beliefs. Teachers are learners of mathematics themselves, particularly in the context of professional development. Teachers with more mathematical knowledge are likely to be more confident and motivated and less anxious toward learning mathematics (Kalder and Lesik 2011). Increasing teachers’ mathematical knowledge has been found to be effective in fostering positive attitudes toward mathematics learning (Haylock 1995; Matthews and Seaman 2007). Conversely, individuals who enjoy learning mathematics may learn more than their counterparts with negative attitudes, as learning can be facilitated or impeded by emotions (Sutton and Wheatley 2003).

Teachers’ attitudes toward mathematics can also influence teachers’ beliefs: the formation of beliefs involves individuals’ evaluations of the feelings and emotions related to personal experiences (Anderson 2005; Nespor 1987). Teachers’ attitudes toward their own learning of mathematics influence their views on the effectiveness of various teaching methods. Wilkins (2008) reported teachers with more positive attitudes toward the learning and teaching of mathematics were more likely to believe in the effectiveness of inquiry-based instruction, and they used such instruction more frequently.

Mathematical knowledge, beliefs, and attitudes are not only influenced by one another; they also affect instruction in complex ways. Wilkins (2008) statistically modeled the relationship among elementary teachers’ mathematical knowledge, beliefs, and attitudes in relation to inquiry-based instruction. Beliefs about the effectiveness of inquiry-based instructional practices mediated the relationship between mathematical content knowledge and teacher-reported frequency of inquiry-based instructional practices, as well as the relationship between attitudes toward mathematics and teacher-reported frequency of inquiry-based instructional practices. In fact, teachers’ beliefs about the effectiveness of inquiry-based instructional practices were the strongest predictor of instructional practice, above and beyond teacher background characteristics such as their years of teaching experience, their highest degree, and the number of mathematics courses taken in preparation coursework. Wilkins concludes that efforts to strengthen content knowledge without also helping teachers to develop positive attitudes and beliefs limit the value of learning the content. PM is premised on the view that professional development needs to target teacher knowledge, beliefs, and attitudes concurrently to produce optimal proximal outcomes in teachers’ practices if the distal goal is student learning and achievement.

Research questions

PM is defined by the core components that characterize high-quality professional development and recognizes that teacher knowledge, attitudes, and beliefs are associated with enhanced capacity for high-quality instruction. This study examines the extent to which teachers’ mathematical knowledge for teaching, attitudes toward mathematics learning, and beliefs about teaching and student learning changed after participating in PM. In order to answer this research question, we conducted two sets of analyses. The first set of analyses focused on whether teachers’ scores upon and after completion of coursework were different from their initial scores (i.e., within-cohort change). If changes were observed within each cohort of PM teachers, a second set of analyses was conducted. The second set of analyses focused on whether changes found in PM teachers were different relative to a group of non-PM teachers (i.e., between-group change relative to a comparison group).

Methods

PM approach and intervention

The architects of PM constructed the program with four fundamental ideas. First, PM is a program for volunteers—participants must be motivated and have the desire to change their practice. Second, the program must take place over an extended period of time, and assignments in pedagogy courses must link directly to teaching practice. Third, participants must be a part of a cohort and encouraged to work on learning content together. Fourth, the program of study must reflect graduate-level coursework expectations and result in graduate credit. These four fundamental ideas come from evidence of effective professional development (e.g., Archibald et al. 2011) and from the PM architects’ past experiences with teacher professional development.

Program architects also referenced Wilson and Berne (1999), who used case studies of exemplary in-service professional development opportunities to highlight three characteristics relevant to the acquisition of professional knowledge specific to teaching: (1) opportunities to talk about subject matter; (2) opportunities to talk about students and learning; and (3) opportunities to talk about teaching. PM coursework was designed to include all three aspects of exemplary professional development as well as provide time for growth and change. Thus, participation in PM is an experience characterized as a “resource-intense, often slow process, requiring time, reflection, and conversation” (Smith 2012, p. 302). Although published after the creation of PM, Chen and McCray’s (2012) approach to professional development also aligns with PM foundations: “the Whole Teacher framework emphasizes promoting all aspects of a teacher’s development, including attitudes, knowledge, and practice” (p. 9).

PM professional development structure

PM is a six-course (18 graduate credit hours), 13-month program. Of the six courses, three focus on increasing teachers’ mathematical knowledge for teaching, and the other three focus on pedagogy and child development. A key difference between undergraduate courses for prospective elementary teachers and the courses in PM is the focus on student thinking and how course ideas can translate into teaching practices. Additionally, connections to curricula and students are possible because practicing teachers have a working knowledge of curricula and access to students that prospective teachers do not. Table 1 depicts the sequence of PM courses (see Footnote 1).

Table 1 Sequence and focus of PM coursework

The three mathematics courses are designed to develop teachers’ mathematical habits of mind and are taught by teams of mathematicians, K-3 teachers, and mathematics graduate students in ways that model student-centered teaching. In the first year, courses were taught by the project principal investigators (PIs). Project faculty and graduate students took over as lead instructors in subsequent years, maintaining close contact with the PIs to ensure the integrity and quality of instruction and support were sustained.

Summer Institutes are intense, concentrated immersion experiences; the instructional style is best described as “guided exploration,” employing a careful balance of instructor- and participant-directed tasks. Classes meet from 8 a.m. to 12 p.m. and 1 p.m. to 5 p.m. Monday–Friday, and nightly homework typically requires three to four hours. An end-of-course assignment is due approximately 3 weeks after the institute and is used to reinforce mathematics learned in the course. Sessions are highly interactive, beginning with small group discussion of the previous night’s homework. Most class time is spent with teachers solving problems in groups, teachers presenting solutions to the class, and whole-class discussions of solutions and representations that draw connections to K-3 mathematics curriculum issues. Class typically ends with instructor-directed discussions to formalize and bring closure to the mathematical concepts being introduced. Teachers quickly appreciate the benefit of group work, and most remain in the classroom at the end of the day for a couple of hours to collaborate on homework assignments with the assistance of the instructional team.

The courses offer a foundation for developing the “habits of mind of a mathematical thinker” (Cuoco et al. 1996), particularly as they relate to problem solving, reasoning, and writing mathematical explanations. In fact, teachers engage in “Habits of Mind” problems throughout summer course work. These problems have multiple entry points as well as multiple solution pathways, making them ideal for individual and group work.

One example teachers found particularly engaging was the “Chicken Nuggets” problem.

A particular brand of chicken nuggets comes in boxes of 6, 9, and 20. What is the largest number of chicken nuggets you cannot order exactly? How do you know?

A problem like this does not lend itself to the type of algorithmic approach teachers often try initially, but instead is ideal for “messing about” (Hawkins 1974) as a means for teachers to find and justify an answer. Additionally, this problem provides a good basis for a discussion about what constitutes proof and helps teachers develop productive mathematical practices (e.g., perseverance; reasoning and communication; CCSS 2010). Importantly, these types of problems also offer an alternative image of how learning can take place in the classroom.
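For readers who want to check an answer to the “Chicken Nuggets” problem, the following is a minimal sketch of a brute-force search. It is an illustration added here, not part of the PM coursework; the function name `largest_unorderable` and the `search_limit` cutoff are assumptions made for this example. The search marks each count as orderable if removing one box of some size leaves an orderable count. A finite search is not a proof on its own: the justification is that once six consecutive counts are orderable (the size of the smallest box), every larger count can be reached by adding boxes of 6.

```python
def largest_unorderable(box_sizes, search_limit=200):
    """Return the largest count in 1..search_limit that cannot be ordered exactly.

    search_limit is an assumed cutoff for this sketch; it must be large enough
    that a run of min(box_sizes) consecutive orderable counts appears below it.
    """
    # orderable[n] is True when n nuggets can be bought as a combination of boxes.
    orderable = [False] * (search_limit + 1)
    orderable[0] = True  # buying no boxes yields zero nuggets
    for n in range(1, search_limit + 1):
        # n is orderable if taking away one box leaves an orderable count.
        orderable[n] = any(n >= size and orderable[n - size] for size in box_sizes)
    gaps = [n for n in range(1, search_limit + 1) if not orderable[n]]
    return max(gaps) if gaps else None


answer = largest_unorderable([6, 9, 20])
print(answer)  # 43
```

The counts 44 through 49 can each be ordered (for example, 44 = 20 + 4 × 6 and 49 = 2 × 20 + 9), so every count above 43 is reachable by adding further boxes of 6, while 43 itself cannot be formed.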

The three pedagogy courses concentrate on increasing teachers’ knowledge of pedagogy and child development. The courses are taught by teams of mathematics educators and graduate students, developmental psychologists, and K-3 master teachers. During the fall semester, teachers take the pedagogy course Teaching Math K-3: Planning Lessons for Diverse Learners, which is a blend of in-person and distance education. Teachers meet face to face several times during the semester, with the remainder of the course online. Meetings include instructor-facilitated discussions as well as small group sharing and collaboration. Teachers submit assignments online throughout the semester and have opportunities to seek support from peers and instructors. In the spring semester, teachers take another pedagogy course, with a comparable structure to the fall course.

The content of assignments takes advantage of teachers’ access to children. Assignments require teachers to implement instructional practices intended to develop their ability to skillfully draw out and respond to children’s mathematical thinking. For example, in the fall pedagogy course, teachers use two Talk Moves (Chapin et al. 2009), such as say more or press for reasoning. Teachers film and transcribe their use of their selected Talk Move and see how such practices play out in their own classrooms, instead of relying on sample videos or reading assignments from coursework alone.

In the second summer, teachers participate in another Summer Institute and the third pedagogy course Communities of Practice and Mathematics. Teachers situate their individual lesson planning within the mathematical ideas of the elementary curriculum, giving particular attention to creating coherence and connections to the learning trajectories of children. Additionally, the course focuses on learning specific strategies for organizing teaching to effectively facilitate learning, including how to space learning over time, use worked examples, choose and use a variety of representations, and create and use deep questions and explanations to enhance learning. Participants are asked to apply their learning to their work as classroom teachers through extended lesson and unit plans.

Program intentions for learners

PM developers were intentional about the sequence of courses and coursework: mathematical content knowledge is the root of pedagogical content knowledge for teaching (Schwab 1978). The leadership team determined coursework would start with mathematics immersion, followed by pedagogy courses during the academic year so assignments could be tied directly to classroom experiences. Mathematically rich, open-ended problems in the content courses—coupled with pedagogy assignments that require teachers to attend to and situate student thinking within learning trajectories—may support growth in knowledge for teaching. For example, in the fall semester pedagogy course, teachers complete a child study project where they formally document their observations and analysis of the mathematical thinking and learning trajectories of two students. Without deep mathematical content knowledge, teachers might not be able to notice students’ mathematical understanding, connections, or misconceptions.

Courses are structured so teachers benefit from having time to extend their reflections and the flexibility to choose how professional development can support their classroom practice. For example, teachers engage in cycles of lesson planning that involve collaboratively planning a lesson, videotaping the lesson, reflecting with peers on various aspects of the lesson, and revising the lesson plan for future implementation. Engaging in these cycles of inquiry, prompted by specific questions upon which to reflect, provides teachers with multiple opportunities to consolidate what they learn from their individual and collaborative experiences.

Course structures afford teachers the flexibility to adapt professional learning opportunities to their school and classroom contexts. In a semester-long project on family–school partnerships, teachers were asked to find ways to build mathematics-centered partnerships with the families of the students in their classrooms. Teachers articulated different goals and strategies to develop partnerships with diverse families. One teacher chose to send home mathematics games because many parents had working hours that did not enable them to attend school functions. Another teacher hosted a “Math and Muffins” morning because she noticed it was easier for parents to come to school in the morning rather than the afternoon or evening. In each instance, teachers had the flexibility to make decisions about the most effective strategies for establishing partnerships unique to their classroom composition (see Fleharty and Edwards 2013 for more on this project).

PM participants

PM was designed to be repeated for multiple cohorts of teachers across time. The admission process was similar across each district. At the request of the school districts, the research team conducted random assignments based on buildings, rather than individual teachers. Having teachers from the same building participating simultaneously would better allow collaboration with their peers.

The first three cohorts were scheduled to run in groups in three different cities. After teachers were selected for the first three cohorts, a matched control group was selected from buildings in core-partner districts with no PM participants, and teachers were matched on both student and teacher demographic characteristics.

Teachers submitted applications that were reviewed by four people: two at the university and two from their district or regional Educational Service Unit (ESU). Teacher applications included a resume, transcript, two essays, and a principal support form. It was important to the leadership team that selected applicants demonstrated, in their applications, potential for growth in both leadership and teaching practices. The applicants also willingly committed themselves to graduate-level coursework in mathematics and pedagogy on top of their full-time teaching responsibilities. After applicants were rated, the final admission decision was made by the PI, taking into account agreed-upon allotments of PM slots to districts and areas.

Accepted teachers were assigned to a cohort, with staggered starting dates. The first cohort started coursework in the summer of 2009, while the second cohort started coursework in the summer of 2010, and so forth. Upon completion of courses, teachers could enroll in an optional leadership course. Some attrition of teachers originally selected for the second and third cohorts occurred prior to beginning coursework: replacement teachers were selected from the same (or nearby) districts.

PM focused its research agenda on the first three cohorts of teachers and a matched comparison group. Thus, the analysis presented here capitalizes on data from these four groups of teachers from three large, urban school districts. To date, PM has been offered to 14 cohorts. As of Summer 2016, a total of 405 teachers have completed the program. While some data are being collected from later cohorts, only the first three were included in the focused research project.

Study design and teacher characteristics

The research agenda evaluated the impact of PM on a subset of teachers in multiple school districts with a matched comparison group. A total of 218 teachers participated in the study, including 126 PM teachers and 92 comparison group teachers. Participating districts include three core partnership districts and 28 smaller districts with only one or two participating teachers from each district in most cases. Matching was based on building-level characteristics: student enrollment, race/ethnicity, and socioeconomic status (SES; see Table 2).

Table 2 Teacher characteristics of PM and comparison teachers (at start of PM or when teachers entered comparison group)

Attrition

Ninety-six percent of teachers who began PM coursework also completed the 18-credit program of study. Four teachers decided not to begin PM after being assigned to Cohort 2. Two teachers were recruited to take their place. In 2011, 19 total teachers out of the two Cohort 3 groups decided not to begin PM coursework and 30 new teachers were recruited to fill those two groups. PM teachers who decided not to begin the program and comparison group teachers who dropped out reported three kinds of reasons for stopping participation: (1) the individual moved out of state or retired; (2) the individual was assigned to a position outside of K-3; or (3) the individual cited a personal situation (e.g., illness; pregnancy). “Appendix 1” depicts the timing of courses taken and measurement occasions.

Procedures and measurement

All data were collected annually each summer from 2009 to 2013 (see “Appendix 1”). Table 3 highlights the significance of each measurement occasion with respect to cohorts’ status within the professional development program. Cohorts 2 and 3 have multiple baseline measurement occasions. We named the measurement occasion prior to their PM start date “Pretest.” In 2009 and 2010, surveys were administered via paper/pencil at in-person meetings to teachers in cohorts each summer and administered online to teachers in comparison groups. Beginning in 2011, all surveys were administered online. Instruments were chosen to align with the PM theory of change and research questions in order to measure program effectiveness.

Table 3 Sequence of measurement occasions

Mathematical knowledge for teaching survey (MKT; Hill et al. 2004; see Footnote 2)

The MKT (versions A and B) assesses teachers’ mathematical knowledge for teaching and aligns with the mathematical content for teaching curriculum of the PM K-3 Math Specialist Certificate program, which focused on numbers and operations and, to a lesser extent, algebra and geometry. The MKT was selected as it was specifically designed to capture elementary school teachers’ mathematical knowledge for teaching (Ball et al. 2008) and is a widely used instrument to assess elementary teachers’ knowledge of the mathematics most relevant to teaching mathematics in meaningful ways to children. The MKT has a multiple-choice format and items situate the mathematics within teaching-specific scenarios. For example, an item may require the respondent to evaluate non-conventional solution methods or represent mathematical content to children. Teachers must know more than procedures and standard algorithms to answer the questions.

This instrument contains 36 and 34 questions for versions A and B, respectively, and contains multiple parts per item within three subscales: Number and Operations; Patterns, Functions, and Algebra; and Geometry. Raw scores are converted to item response theory (IRT) scores (based on a nationally representative sample of K-6 teachers) such that 0 represents the MKT of an average K-6 teacher. During the data collection phase, teachers first took version A and then version B, continuing to alternate between versions across the data collection years. The two versions are equated so teachers’ scores could be compared across versions.

Fennema–Sherman Mathematics Attitudes Scales for Teachers (FSMAS; Fennema and Sherman 1976)

A goal of PM was to increase teachers’ confidence in, and foster positive attitudes toward, their ability to learn mathematics. The FSMAS-T, as we adapted it, captures these attitudes. Teacher attitudes toward their own learning of mathematics were measured by three of the nine original FSMAS subscales. Each original scale has 12 items, and responses are measured using a 5-point Likert scale (where 1 = “Strongly Disagree” and 5 = “Strongly Agree”). When the research team started PM in 2008, no instrument measuring practising elementary teachers’ mathematical attitudes was available. The research team reviewed literature on students’ mathematical attitudes and decided to adapt the FSMAS for use with teachers. The FSMAS was originally developed to assess high school students’ attitudes toward mathematics (Fennema and Sherman 1976) and is among the most popular instruments used in studies of students’ mathematical attitudes.

We adapted and revised the items of three subscales (as needed) to ensure they were applicable to present-day teachers, rather than students. The three scales included: (1) Confidence in Learning Mathematics, which measures one’s confidence in his/her ability to learn and to perform well in mathematics; (2) Mathematics Anxiety, which measures one’s “feelings of anxiety, dread, nervousness, and associated bodily symptoms related to doing mathematics”; and (3) Effectance Motivation, which measures whether one enjoys and seeks challenges regarding mathematics. Our selection of scales aligned with the goals of PM, as well as the literature on teacher attitudes, classroom instruction, and student outcomes.

This adapted instrument was first piloted and then validated using several samples of primary teachers. The adaptation and validation procedures were reported in detail in Ren et al. (2016). Three items were removed based on factor analyses (Ren et al. 2016). In the current study, coefficient alphas were calculated using teachers’ responses when they completed this survey for the first time: Confidence (10 items; α = .92), Effectance Motivation (11 items; α = .93), and Anxiety (12 items; α = .94). All three scales showed excellent reliability based on the rules of thumb provided by George and Mallery (2003): “≥ .9” is “Excellent,” “≥ .8” is “Good,” “≥ .7” is “Acceptable,” “≥ .6” is “Questionable,” “≥ .5” is “Poor,” and “< .5” is “Unacceptable.”
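To make the reliability statistic concrete, the following is a minimal sketch of how coefficient alpha can be computed from item-level responses and then labeled with the George and Mallery (2003) rules of thumb. The function names and the response data are hypothetical and are not taken from the study; values below .5 are treated as “Unacceptable.”

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for rows of item scores (one row per respondent)."""
    k = len(responses[0])  # number of items in the scale
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of respondents' summed scale scores.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def reliability_label(alpha):
    """Rule-of-thumb label following George and Mallery (2003)."""
    cutoffs = [(0.9, "Excellent"), (0.8, "Good"), (0.7, "Acceptable"),
               (0.6, "Questionable"), (0.5, "Poor")]
    for cutoff, label in cutoffs:
        if alpha >= cutoff:
            return label
    return "Unacceptable"


# Hypothetical responses: 4 teachers x 3 items on a 5-point Likert scale.
example = [[4, 5, 4], [2, 2, 3], [5, 5, 5], [3, 3, 2]]
print(round(cronbach_alpha(example), 2))
print(reliability_label(0.92))  # alpha reported for the Confidence scale
```

Negatively worded items would need to be reverse-scored before this calculation; that step is omitted here for brevity.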

Mathematics Beliefs Scales (MBS; Capraro 2001; Fennema et al. 1990; Ren and Smith 2013)

Another goal of PM was to cultivate more student-centered instructional practices. The MBS captures potential shifts in beliefs about teaching and learning that range from more traditional teacher-centered perspectives to more constructivist, student-centered perspectives. The Mathematics Beliefs Scales is a widely used measure of teacher beliefs about mathematics teaching and learning. We used the short form, composed of 18 items, to manage the length of the questionnaire. Responses are measured on a 5-point Likert scale (where 1 = “Strongly Disagree” and 5 = “Strongly Agree”). The instrument aligned with the goals of PM: we hoped teachers would embrace progressive beliefs (e.g., students’ active role in learning) toward mathematics learning and teaching.

Using data from a subset of the sample in the current study, Ren and Smith (2013) examined the factor structure of the short-form MBS. After eliminating four problematic items, a two-factor structure was identified: student-centered beliefs (6 items; α = .78) and teacher-centered beliefs (8 items, α = .86). Teachers who value student-centered teaching believe students construct their own knowledge through active investigation and meaningful exploration. Teachers who value teacher-centered teaching believe students should be told or shown how to do mathematics and that it is important for students to always solve problems as efficiently as possible.

Results

Results are reported by categories of teacher outcomes in the order of teacher knowledge, attitudes, and beliefs. For each outcome category, we first report whether teachers’ scores at the end of coursework differ from their initial scores (within-cohort change), specifically for the Cohort 1 teachers. We highlight Cohort 1 because it provided the most follow-up data, enabling us to observe whether changes were sustained over time. Then, we report whether the changes found in PM teachers (all cohorts combined) differ from those of a group of non-PM teachers (between-group change relative to comparison group). “Appendix 2” contains tables providing estimated changes between each consecutive measurement occasion for all eight outcomes for each cohort and the associated p values.

To examine within-cohort changes, data were analyzed using linear mixed models for repeated measures in SAS (v. 9.4) to estimate the overall pattern of differences in knowledge, attitudes, and beliefs across five measurement occasions (see “Appendix 3” for full model specification). The interaction between cohort and time in program status (i.e., where an individual teacher is within the sequence of PM coursework) was specified as a predictor to examine whether each PM cohort showed changes across measurement occasions; city was specified as a covariate. Repeated measurements on the same teacher were allowed to covary; compound symmetry was determined to be the most parsimonious covariance model with adequate fit. We conducted post hoc analyses using the “CONTRAST” and “LSMESTIMATE” statements in SAS to further examine the specific differences between any two measurement occasions for each cohort.
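As an illustration of what the compound symmetry assumption implies, the sketch below builds the covariance matrix for one teacher’s five measurement occasions: every occasion has the same variance and every pair of occasions has the same covariance. The function name and the variance components are hypothetical; the study’s actual estimates come from the SAS models specified in “Appendix 3.”

```python
def compound_symmetry(n_occasions, between_var, residual_var):
    """Covariance matrix with equal variances and equal covariances.

    between_var:  variance shared by all occasions for the same teacher
    residual_var: occasion-specific variance
    """
    return [
        [between_var + residual_var if i == j else between_var
         for j in range(n_occasions)]
        for i in range(n_occasions)
    ]


# Hypothetical variance components; five occasions as in the study design.
cov = compound_symmetry(5, between_var=0.5, residual_var=0.25)
for row in cov:
    print(row)
```

Because only two parameters are estimated regardless of the number of occasions, this structure is more parsimonious than an unstructured covariance matrix.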

To examine between-group change, we calculated “change scores,” specified as the difference between teachers’ initial scores and scores 1 year after completion of PM. We chose to compute the difference between pretest scores and the first follow-up scores for all of the teacher outcomes for two reasons. First, the posttest was administered to Cohorts 2 and 3 participants before they started the second Summer Institute and was therefore not a true posttest. Second, we believe teachers may need time to consolidate their knowledge and skills obtained from the program, which then results in measurable change: analysis of 1-year follow-up scores reflects this theory better than posttest scores. After calculating change scores, we compared change score slopes for PM teachers with those of the comparison teachers.
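The change-score arithmetic described above can be sketched as follows. The scores are hypothetical and the helper name is ours; the sketch shows only how pretest and first follow-up scores combine into change scores and a between-group difference, not the significance tests used in the study.

```python
from statistics import mean


def change_scores(pretest, follow_up):
    """First follow-up score minus pretest score, teacher by teacher."""
    return [f - p for p, f in zip(pretest, follow_up)]


# Hypothetical IRT-style scores for a few teachers in each group.
pm_pretest = [-0.5, 0.0, 0.25]
pm_follow_up = [0.25, 0.5, 0.5]
comparison_pretest = [-0.25, 0.0, 0.5, 0.25]
comparison_follow_up = [0.0, 0.0, 0.75, 0.25]

pm_change = mean(change_scores(pm_pretest, pm_follow_up))
comparison_change = mean(change_scores(comparison_pretest, comparison_follow_up))

# A positive difference means PM teachers changed more than comparison teachers.
print(pm_change, comparison_change, pm_change - comparison_change)
```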

Effect sizes are also reported for within-cohort and between-group effects for each outcome. MKT IRT scores are expressed in standardized units, so mean differences within cohort can be interpreted as effect sizes. According to the Learning Mathematics for Teaching Project (LMT 2004), effect sizes are considered noteworthy when greater than one quarter of a standard deviation: growth of .3 standard deviations is small but significant, growth of .5 is moderate, and growth of .75 and above is substantial and considered to represent a large effect size. Effect sizes for within-cohort change and the difference in the mean change between PM and comparison group teachers for all other outcomes were estimated using recommendations outlined in Olejnik and Algina (2000) for multifactor designs. Hill et al. (2008) suggest a minimum detectable effect size of .25 for the effect of an intervention to have “educational significance” (p. 30), while Cohen (1988) recommends standardized mean differences of .2, .5, and .8 for small, medium, and large effects, respectively.
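The two sets of benchmarks can be summarized in a short sketch. The function names, the exact bin boundaries between benchmarks, and the labels for values below the lowest benchmark are our assumptions for illustration; the cited sources give only the benchmark values themselves.

```python
def lmt_label(effect):
    """Label an MKT effect size using the LMT (2004) benchmarks cited above."""
    size = abs(effect)
    if size >= 0.75:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size > 0.25:
        return "small"
    return "below noteworthy threshold"  # assumed label, not from LMT


def cohen_label(effect):
    """Label a standardized mean difference using Cohen's (1988) benchmarks."""
    size = abs(effect)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # assumed label, not from Cohen


print(lmt_label(0.62))    # a within-cohort MKT change of 0.62 SD
print(cohen_label(0.28))  # a between-group attitude effect of .28
```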

Mathematical knowledge for teaching

PM teachers strengthened their mathematical knowledge for teaching in the areas of Number and Operations as well as Geometry. Within the MKT assessment framework, IRT scores are generated based on raw scores, where a score of 0 represents the national average among K-6 teachers, and a score of 1 or −1 is one standard deviation above or below the national average (see Footnote 3).

Prior to their participation in PM, Cohort 1 teachers’ initial Number and Operations IRT score was estimated to be −0.23 on average, which is about one-fifth of a standard deviation below the national average for K-6 teachers. However, at the end of coursework, Number and Operations IRT scores had increased by 0.62 (p < .001), placing participants’ average score above the national norm for K-6 teachers, a significant marker of positive growth that translates into a moderate effect size. Interestingly, Cohort 1 teachers’ MKT scores continued to grow after coursework completion. We observed an additional growth of 0.52 (p < .01) in teachers’ Number and Operations IRT scores 3 years after completion of coursework (i.e., the difference between posttest and the third follow-up measurement occasion), a moderate effect size. Figure 1 presents the pattern of change in MKT IRT scores for Cohort 1 teachers. Cohorts 2 and 3 also demonstrated comparable change (see “Appendix 2”).
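Because IRT scores are in standard deviation units centered on the national K-6 average, the reported changes can be read cumulatively. The sketch below simply adds the estimated changes reported above to the estimated baseline; the variable names are ours, and the implied means are approximations that may differ slightly from the model-based estimates in “Appendix 2.”

```python
NATIONAL_AVERAGE = 0.0  # IRT score of an average K-6 teacher

pretest = -0.23                               # estimated Cohort 1 baseline
posttest = round(pretest + 0.62, 2)           # change over the coursework
third_follow_up = round(posttest + 0.52, 2)   # further change after posttest

print(pretest, posttest, third_follow_up)
print(posttest > NATIONAL_AVERAGE)  # is the cohort above the national norm?
```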

Fig. 1 Patterns of change in Cohort 1 teachers’ MKT IRT scores across time

Because we observed growth in PM teachers’ Number and Operations IRT scores for each cohort, we then examined the trajectory of change of the combined PM cohorts relative to the comparison group. PM teachers had a steeper slope relative to the comparison group, F(3, 496) = 2.82, p = .04, suggesting that PM teachers grew more than comparison teachers in their knowledge of Number and Operations. This difference in slopes translates to a small but educationally significant effect size of .31 (see Fig. 2).

Fig. 2 Mean Number and Operations IRT scores across time for Cohorts 1 through 3 and the comparison group

As with Number and Operations, Cohort 1 teachers’ initial mean Geometry IRT score of −0.27 was below the K-6 national average (i.e., IRT = 0). However, after completing PM, teachers grew by 0.35 (p = .006), placing them around the national average, a small positive effect. Cohort 1 teachers also continued to grow after coursework completion. We observed an additional growth of 0.30 in teachers’ Geometry IRT scores 2 years after completion of coursework (p = .03). Cohorts 2 and 3 also grew in desired directions from pretest to posttest, but not significantly (see “Appendix 2”). When comparing change scores between PM and comparison teachers, we found PM teachers had a steeper slope relative to the comparison group, F(3, 540) = 4.95, p = .002, suggesting that the PM teachers grew more than the comparison teachers in their knowledge of Geometry from their pretest scores to their first follow-up scores, an effect size of .36. PM did not have a significant impact on teachers’ Patterns, Functions, and Algebra scores.

Attitudes toward learning mathematics

Teachers reported more confidence in their ability to learn mathematics after participating in PM. On average, Cohort 1 teachers’ Confidence scores increased by 0.18 on a five-point Likert scale from pretest to posttest (p = .02), an effect size of .25. This increase was sustained 3 years after the completion of coursework, as indicated by no significant changes in teachers’ confidence ratings between posttest and follow-up occasions (see Fig. 3). On average, Cohorts 2 and 3 showed the same pattern of changes in their reported confidence levels. Combining the three PM cohorts, PM teachers had larger changes in reported confidence levels from pretest to the first follow-up occasion, relative to the comparison teachers, F(3, 494) = 7.09, p < .001, which translates to an effect size of .28.

Fig. 3 Patterns of change in reported levels of Confidence, Effectance Motivation, and Anxiety for learning mathematics across time for Cohort 1

Teachers also reported higher levels of motivation to learn mathematics after participation in PM. On average, Cohort 1 teachers’ Effectance Motivation scores increased by 0.22 over the duration of PM (p = .01), an effect size of .32. This increase was sustained for 3 years after completion of coursework (see Fig. 3). Cohort 2 showed the same pattern of change; however, Cohort 3 showed an increase in reported motivation both during PM and after the completion of PM. PM teachers showed larger changes in reported Effectance Motivation levels relative to the comparison teachers, F(3, 499) = 18.45, p < .001, which translates to an effect size of .60.

Finally, teachers reported less anxiety toward learning mathematics after participating in PM. Cohort 1 teachers’ Anxiety scores decreased by 0.27 over the duration of PM (p < .001), an effect size of .35. This decrease was sustained 3 years after completion of the program (see Fig. 3). Cohort 2 showed the same pattern of change, while Cohort 3 continued to decrease significantly both during PM and after the completion of PM. PM teachers showed larger decreases in their anxiety toward learning mathematics relative to the comparison teachers, F(3, 496) = 9.26, p < .001, which translates to an effect size of .42 (see Fig. 4).

Fig. 4 Mean reported levels of Anxiety across time for Cohorts 1 through 3 and the comparison group

Beliefs

Teachers reported lower levels of teacher-centered beliefs after participation in PM. On average, Cohort 1 teachers’ teacher-centered beliefs decreased by 0.24 (based on a five-point Likert scale) over the duration of PM (p = .02), an effect size of .39; this decrease was sustained 3 years after completion of PM coursework (Fig. 5). Cohort 3 showed similar patterns of change from pretest to posttest. While Cohort 2 returned to more teacher-centered beliefs 1 year after completing PM (p = .02), they still reported lower levels of teacher-centered beliefs in the follow-up years than their initial scores indicated. PM teachers showed larger decreases in their teacher-centered beliefs relative to the comparison group, F(3, 507) = 14.04, p < .001, which translates to an effect size of .67.

Fig. 5 Patterns of change in student- and teacher-centered beliefs across time for Cohort 1

Teachers reported higher levels of student-centered beliefs after participating in PM. On average, Cohort 1 teachers’ student-centered beliefs increased by 0.45 over the duration of PM (p < .001), an effect size of .91. This increase was sustained 3 years after completing PM coursework (see Fig. 5). Cohorts 2 and 3 showed similar patterns of change from pretest to posttest, while Cohort 3 continued to show increases in their levels of student-centered beliefs from posttest to the first follow-up occasion. PM teachers showed larger increases in their student-centered beliefs relative to the comparison group, F(3, 512) = 17.61, p < .001, which translates to an effect size of .80 (see Fig. 6).

Fig. 6 Mean student-centered beliefs across time for Cohorts 1 through 3 and the comparison group

Overall, participation in PM yielded positive results. Bloom et al.’s (2008) criteria for “educational significance” were met: effect sizes for all teacher outcomes exceed .25 (see Table 4). Interpretation of effect sizes for within-cohort and between-group change for the MKT was based on the guidelines established by the LMT Project (2004). Interpretation of within-cohort change and between-group change for the attitudes and beliefs outcomes was based on Cohen’s (1988) criteria.

Table 4 Effect sizes describing the impact of participation in PM on teacher knowledge, attitudes, and beliefs

Discussion

PM produced positive effects on teachers’ cognitive and affective domains deemed important to ambitious mathematics instruction described in both theory and prior research (Heaton 2000; Lampert 2003). Although teachers changed in ways that aligned with the broader goals of PM, the program and research design preclude the isolation of specific “key components” that led to the statistically significant changes in teacher outcomes. PM is based on a “whole teacher” approach (Chen and McCray 2012) in which teachers’ knowledge, attitudes, and beliefs are all addressed in tandem through course content, sequence, and structure. A single course assignment or learning structure is not necessarily responsible for changes in a particular teacher outcome; rather, all the components were indispensable for collectively promoting teacher change.

Additionally, the target teacher outcomes—knowledge, attitudes, and beliefs—are highly interconnected psychological constructs. Thus, it is not unexpected that changes in one may have a consequential (if not cascading) effect on others (Campbell et al. 2014; Ernest 1989; Holm and Kajander 2012; Wilkins 2008). For example, it may be the case that changes in mathematical content knowledge for teaching fostered more confidence in teachers regarding their capacities to learn mathematics (Haylock 1995; Kalder and Lesik 2011; Matthews and Seaman 2007), as well as facilitated changes in teacher beliefs about children’s capacity to construct mathematical knowledge (Ball 1991; Cooney and Wilson 1993; McDuffie 2004). Individual teachers may follow different trajectories of change during professional development, even though they may all arrive at the same destination (Franke et al. 1997). Thus, teachers may “interact” with the various components of the PM program differently; this allows teachers to assimilate the program in ways that correspond to their individual learning styles to maximize their growth.

Linking PM to previous research

Existing theories and empirical evidence may in part explain the pattern of positive changes (across the set of targeted outcomes) in teacher outcomes. PM activities were similar to those found in other high-quality professional development programs, particularly the focus on mathematical content knowledge for teaching linked to classroom practice (Borko 2004; Carpenter et al. 1989, 1996, 2000; Cuoco et al. 1996; Elmore 2002; Schifter and Fosnot 1993; Wilson and Berne 1999).

A related focus of the mathematics content courses is developing teachers’ mathematical habits of mind (Cuoco et al. 1996). Smith and Shen (2012) qualitatively examined the trajectories of change in teachers’ habits of mind across participation in PM. Teachers realized their capacities to learn mathematics were malleable, and their mathematical knowledge could become a professional strength: “I am pleased to say that through the struggles and doubting my abilities in higher-level mathematics, I have grown as a mathematician…and I am proud of my perseverance” (Assignment, June 18, 2010). By having teachers work on problems with multiple solution paths and potential representations, and then communicate their reasoning to others, the mathematics courses model ways teachers can engage with students in their classrooms. This goal was realized in the classroom of one PM teacher, who noted that her students:

…analyzed and evaluated the thinking and strategies of others…[and] often made connections more when they were connecting to their peers rather than to [me]…Students learned to be a community of problem solvers…[and they] selected and applied different representations to solve problems. (2009)

Based on testimony such as that quoted above, we speculate that the habits of mind teachers cultivated as they developed their knowledge for practice (Cochran-Smith and Lytle 1999) may support their inclination and capacity to promote similar habits of mind within their students. Future work may examine the processes through which this may occur.

In addition, assignments required teachers to use coursework as a lens through which to situate children’s mathematical understanding within a learning trajectory (Clements and Sarama 2009). Also potentially critical were teachers’ cycles of pedagogical action (e.g., using a Talk Move; Smith and Stein 2011) and reflection (e.g., analyzing the video and transcript of the Talk Move to study the efficacy of the teaching moves from one moment to the next). Such cycles of action and reflection are considered important for changing teacher beliefs and perhaps also practice (e.g., Franke et al. 1997). Previous research has shown that simply telling teachers how they should teach or showing them models of teaching is far from enough to change beliefs; intense experiences with students and deep reflections on these experiences are crucial to changing teachers’ beliefs about mathematical teaching and student learning (Ambrose 2004; Cooney et al. 1998; Franke et al. 1997; Grant et al. 1998; Mewborn 2001). Various assignments in PM pedagogy courses required teachers to reflect on their actions, which might have enabled them to really see students’ thinking and strengths in dealing with mathematics, leading to an endorsement of student-centered beliefs. Furthermore, the cycles of action and reflection were made more powerful through the support of their cohort peers (as found by DuFour 2004; Hill 2004) and over an extended period of time (as found by Garet et al. 2001; Hill 2009; Wayne et al. 2008; Yoon et al. 2007).

Teacher perspectives

In addition to our statistical analysis, feedback received from participants supports our interpretation of program effectiveness. The shared experience of successfully solving complex problems often shifted teachers’ perceptions about their capacity to learn mathematics.

It is truly hard to pick one part or one aspect of this program. The relationships that I have made are one key aspect. As you sit through the classes you rely on one another. … The instructors and assistants were always there to comfort and help you. I never once felt alone. I also greatly enjoyed the mathematical discussions where I was pushed to think about my own thinking. …I was able to take all the strategies, tools, and Math Talk and put them into action in my classroom. (Participant)

Having peers and instructors for support may have ameliorated the anxiety teachers might otherwise feel if they had to tackle course assignments alone.

Moreover, we believe the content, sequence, and structure of the program as a whole contributes to teacher changes. Below is another teacher’s reflection that was quite typical of PM participants.

Reflecting on what I have learned … all children are in great need of more in-depth exposure, experience and practice with number sense. …We need to teach topics more in-depth and ask “how,” “why,” and “are there other ways to solve this mathematics problem?” … A deep understanding of number sense will impact children’s thinking and build their foundation. (Participant)

This teacher’s reflection speaks to the need for mathematical connections and the necessity for building depth of conceptual understanding, particularly related to number sense.

Limitations

These results should be interpreted in light of the following limitations. The first limitation concerns self-selection. It may be the case that teachers who applied to participate in PM were more knowledgeable and/or motivated to engage in mathematical thinking and learning relative to the general K-3 teaching population. Indeed, PM teachers went through a significant application process, including two essays addressing their perceived challenges in teaching K-3 mathematics and the depth of their own understanding of elementary mathematics content and pedagogy. The applicants also willingly committed themselves to graduate-level coursework in mathematics and pedagogy in addition to full-time teaching responsibilities.

To address the potential volunteer bias, we examined whether PM teachers had statistically different baseline scores relative to the comparison group for each outcome. On average, PM teachers started out with higher mathematical knowledge for teaching, more positive attitudes toward learning mathematics, and more progressive beliefs about teaching and student learning relative to the comparison group at pretest. Therefore, self-selection may explain these significant differences in initial baseline knowledge, attitudes, and beliefs. Nevertheless, participating in PM is an intensive, time-consuming, and demanding experience. Even though self-selection might complicate the interpretation of statistical analyses, it is one of the four fundamental ideas on which PM was based. While motivation was an important and possibly a distinguishing factor in who applied, motivation alone was not likely to translate into desirable teacher outcomes if there had not been a focused and coherent professional development program (Hill et al. 2016; Wilson and Berne 1999).

A further limitation is attrition, although rates were quite low. Ninety-six percent of teachers who began PM coursework also completed the 18-credit program. Some attrition did occur prior to teachers’ beginning PM, especially for Cohort 3 teachers who were asked to wait for 2 years before beginning courses, but these teachers were replaced by similar individuals who underwent the same application process as those who had dropped out.

Scholarly contribution

PM is unique among Math-Science Partnerships in its focus on lower elementary teachers. Parallel to the “whole child” approach, PM simultaneously attended to fostering the cognition (knowledge) and affect (attitudes and beliefs) of participants within the envelope of a learning community composed of peers. This study contributes to existing knowledge of what constitutes effective professional development for in-service lower primary teachers in three ways. First, our program models how a focused and coherent professional development program can be delivered to multiple cohorts of teachers across Nebraska. The school districts involved used different mathematics curricula, and thus, PM is not limited to a specific curriculum. Moreover, PM was delivered by a changing team of instructors (composed of faculty, graduate students, and master teachers) and is thus replicable at other sites. Second, PM is an intensive, long-term Elementary Mathematics Specialist program. The longitudinal research design included multiple follow-up data collections to measure the extent to which the impact of professional development was sustained over time, rather than utilizing a simple pre-post design that may capture a halo effect. Finally, this study reports on multiple types of teacher-level outcomes that are broadly accepted as relevant and associated with high-quality mathematical instruction. In particular, we focus on teachers as learners of mathematics, a variable not often specified as a programmatic outcome, which PM sought to foster. Additionally, we found that this program was particularly successful in helping teachers to embrace student-centered beliefs, such as the belief that children are capable of constructing and exploring mathematical concepts.