Introduction

In the USA, the Common Core State Standards for Mathematics (CCSSM) represent an effort to guide the country toward a “substantially more focused and coherent” (Common Core State Standards Initiative 2010, p. 3) mathematics curriculum. For the more than 40 US states that have adopted these standards, the CCSSM also generally represent a more demanding set of learning goals than prior standards (Porter et al. 2011; Schmidt and Houang 2012). Teachers working to implement the new standards are often doing so without updated textbooks, and many are developing their own lessons or using free materials obtained from the internet (Davis et al. 2013). District leaders have also pointed to insufficient materials as well as a limited capacity of teachers to select and adapt materials to organize coherent sequences of instruction as major obstacles to CCSSM implementation (McLaughlin et al. 2014).

As teachers are positioned as designers of their own standards-aligned curriculum, it is critical they receive professional development concerning the quality of curriculum materials and support in aligning materials to the CCSSM. Curriculum publishers’ own claims of CCSSM alignment have been shown to be untrustworthy (Polikoff 2015), and the very nature of academic achievement standards necessarily opens them to varying interpretations (Sadler 2014). It is teachers’ interpretations of standards, not the standards themselves, which shape how teachers implement them (Hill 2001, 2006). Teachers’ varied beliefs, knowledge, and experiences influence how they make sense of standards, leading to widely differing interpretations (Spillane 2004). So, too, do their interactions with colleagues, which can lead to locally shared interpretations of standards that diverge from policymakers’ intentions (Coburn 2001).

One productive strategy pursued in mathematics education for building teacher capacity has been professional development organized around the analysis of mathematical tasks. Professional development focused on mathematical tasks has helped teachers to select tasks that support high-level student reasoning (Stein et al. 2009). There is evidence that task-based professional development can increase teachers’ selection and implementation of cognitively demanding tasks (Boston and Smith 2009, 2011) and change the way teachers understand how tasks influence student learning (Boston 2013).

In this article we explore one school district’s effort to use professional development involving mathematical task analysis to support its efforts to build Algebra 1 teachers’ capacity for the implementation of new standards. We co-designed this professional development with district leaders and implemented it with a cadre of teacher leaders over the course of a year, during the district’s early efforts to implement the CCSSM. From the beginning, it became clear that district leaders, teachers, and researchers held multiple goals for the joint work, including augmenting current curriculum materials and building a common understanding of the standards. The competing goals and values of project stakeholders manifested themselves in a number of design tensions (Tatar 2007) related to the task-based professional development. Therefore, this study investigates the following questions:

  1. What design tensions emerge in the process of co-designing task-based professional development for high school Algebra 1 teachers in a large, diverse urban school district?

  2. How do design tensions influence the evolution of the professional development?

Background

The Inquiry Hub project, funded by the National Science Foundation (NSF), was an effort by an ongoing research–practice partnership (Coburn et al. 2013) that brought together school district curriculum leaders, teachers, university researchers, and Web engineers. The primary research goals of this partnership were to understand how diverse groups of stakeholders can come together to design innovative approaches to the creation and adaptation of digital STEM curricula. The partnership was originally funded in 2008 to design an online curriculum repository and planning tool for a digital Earth science curriculum (Lee et al. 2014; Sumner 2010). Responding to other needs in the school district, the Inquiry Hub partnership expanded its work to include learner-centered curriculum in both mathematics and science and to support teachers’ use of this curriculum in adaptive and learner-centered ways. For mathematics, district curriculum leaders expressed a particular need to help high school Algebra 1 teachers prepare for changes brought on with the adoption of the CCSSM.

A team of researchers and district leaders from the partnership undertook an effort to co-design professional development both for and with a cadre of teacher leaders. Co-design is a highly facilitated, team-based process in which educators, researchers, and developers work together in defined roles to design an educational innovation, realize the design in one or more prototypes, and evaluate each prototype’s significance for addressing a concrete educational need (Penuel et al. 2007). In Inquiry Hub, the educators included both district leaders and high school teachers, and the “prototypes” were a coordinated set of activities organized around protocols we developed or adapted for identifying, analyzing, and discussing the qualities of mathematical tasks. This article describes the design activity of the 2012–2013 school year, the first year of Algebra 1 work within Inquiry Hub. The year was marked by two distinct phases: (1) a cooperative effort by district leaders and researchers to define a set of task quality criteria aligned to the goals of the CCSSM and attentive to the needs of students for whom English was not their primary language, and (2) iterations of co-design with all stakeholders in which teachers enacted the task analysis routine and provided feedback about its use. We are presently evaluating these prototypes against how well these processes helped teachers develop common understandings of the CCSSM and how tasks aligned with the standards (Johnson, in progress).

From the beginning, we became aware that different participants brought different experiences and purposes to their participation in the co-design process. Prior to the Algebra 1 work in Inquiry Hub, district leaders had worked with the Institute for Learning (IFL) to help teachers analyze task quality, seeing it as a particularly promising approach to build teacher capacity. Both the district and teacher participants were concerned about the mismatch between their adopted curriculum materials and the CCSSM; for them, identifying and analyzing tasks was a means for augmenting those materials. For researchers, the project presented an opportunity to study the development of teachers’ pedagogical design capacity (Brown 2009) and to continue to work with project Web engineers to expand the online curriculum repository to become a platform for teacher authoring and adaptation of materials.

Both as part of the co-design process and retrospectively, it can be useful to study how teams surface and manage these different goals. During the co-design process, attending to goals is critical, as it facilitates participants’ ownership over the design process (Penuel et al. 2007). Retrospective analysis of the kind we present here can inform larger macrocycles of design (Gravemeijer and Cobb 2013) by helping design teams identify principles for future co-design efforts. In this instance, our aim was to analyze the tensions that emerged over the course of the year in order to identify key conditions for using mathematical task analysis as a professional development strategy and to understand how doing so might support the implementation of challenging new learning goals for all learners such as those embodied in standards like the CCSSM.

The design tensions framework

The design tensions framework (Tatar 2007) is a way of conceptualizing design as a process in which goals are balanced across the needs of multiple stakeholders. The framework highlights that “design exists because of the tension between what is and what ought to be” (p. 415, emphasis original) and uses tensions to focus on the values of the stakeholders and the negotiations and compromises that exist in response to conflicts in design. Design tensions stand in contrast to design spaces (cf. Card and Krueger 1998), an approach to understanding design that focuses on categories of independent design choices and the permutations of possible designs. Whereas design spaces presume all choices are equivalent, design tensions foreground resource limitations and the fact that choices arise from multiple goals. Design tensions draw attention to how choices result in trade-offs, insights, or a reformulation of the problem the design is intended to address.

Tatar’s design tensions framework characterizes four levels where tensions may arise. Vision is the highest level, “a fundamental expression of the values and interests of the project goals” (p. 417) that comes from the tension between what is and what ought to be. Approach is the second level, entailing “the expression of an intended implementation” (p. 418). This is where designers formulate actions that will reconcile their current reality (the “what is”) with their goals and values (the “what ought to be”). In the approach level, tensions exist around such things as technical capabilities, the abilities of project members, and the policy environment in which the project exists. The third level of the framework, project tensions, reflects actual implementation decisions. Project tensions exist where design work is typically most visible, within designers’ scope of influence where “means, ways, and values come into conflict” (p. 418). The final level of the design tensions framework represents as-created situations, where consequences of design decisions exist as new situations and dilemmas. These consequences may benefit the process of design, or they may be in tension with goals at other levels of the framework.

Past research has identified several ways that identifying and surfacing design tensions can benefit design teams and help build research knowledge. During the design process, surfacing tensions can help teams build relationships and trust among participants by attending to and naming different goals. Retrospectively, an analysis of design tensions can reveal critical turning points in design processes and identify principles for guiding future design efforts (Penuel et al. 2014). Attention to tensions can also identify inequities of participation in the design process, particularly those arising from historically accumulating tensions among different role groups (Severance et al. 2014). In the present analysis, we explore how an analysis of design tensions can build research knowledge and inform future efforts to organize professional development around mathematical tasks.

Multiple purposes for using mathematical tasks in professional development

For more than two decades, researchers have attended to the role of instructional tasks in mathematics classrooms. Instructional tasks shape the products students produce and how they produce them (Doyle 1983) and in mathematics are defined as “a classroom activity, the purpose of which is to focus students’ attention on a particular mathematical idea” (Stein et al. 1996, p. 460). Instructional tasks mediate student learning in the classroom: The more the students encounter cognitively demanding tasks in instruction, the better they perform on tests of sophisticated mathematical thinking and reasoning (Hiebert and Wearne 1993; Stein and Lane 1996).

Cognitively demanding tasks are ones that require students to engage in “complex, non-routine thinking and reasoning such as making and testing conjectures, framing problems, representing relationships, and looking for patterns” (Stein and Kim 2009, p. 42). Moreover, sequencing tasks to place increasing demands on students for sophisticated forms of mathematical thinking and reasoning is key to supporting students’ growth along hypothesized learning trajectories (Simon and Tzur 2004).

The use of instructional tasks has also been a focus of teacher professional development. Arbaugh and Brown (2005) conjectured that task sorting exercises would be a non-threatening way for teachers to examine their own practice, and found that teachers changed their categorizations of tasks over time to better reflect levels of cognitive demand. Stein et al. (2009) developed a case book for use in professional development that includes a number of tasks, rubrics for analyzing task qualities, and protocols for discussion of tasks among teachers. Subsequent research on activities grounded in their approach (Boston and Smith 2009; Boston 2013) has shown changes in teachers’ task selection after task-focused professional development, with some teachers sustaining that effect over time (Boston and Smith 2011).

Task-based professional development has also been used in ways other than to attend to the cognitive demand of tasks. Swan (2007) designed professional development centered on a series of task types that enabled participating teachers to examine, and in some cases, shift their beliefs toward a more student-centered and connectionist approach to teaching mathematics. Elliott et al. (2009) worked with teacher leaders to develop a program of task-based professional development that used frameworks of sociomathematical norms (Yackel and Cobb 1996) and practices for orchestrating discussion (Stein et al. 2008) to develop teachers’ mathematical knowledge for teaching (Ball et al. 2008). These varied uses of task-based professional development illustrate the ways mathematical tasks provide a practice-based context for the development of different kinds of teachers’ knowledge, beliefs, and practice.

The projects listed above underscore the fact that task analysis can serve multiple purposes. Most research reports, however, highlight one purpose or analyze multiple outcomes, without documenting whether multiple goals were in play and, if so, what trade-offs were involved in the designs that were implemented. For the purpose of contributing to professional development research in isolation from its context, focusing on the efficacy or potential of task analysis does not necessarily present a problem for the field. But when professional development is embedded within larger educational systems, such as a school district, professional development designers must coordinate their efforts with other initiatives (Jackson and Cobb 2013). This includes attending to the differences in aims of administrators and teachers that arise from their different roles and responsibilities (Penuel et al. 2004), as well as accounting for the instructional realities (Zhao et al. 2004) faced by teachers in the form of resource constraints and varied pressures and initiatives that compete for teachers’ attention. Our conjecture, which we explore in this article, is that design tensions arise that require teams to adapt and evolve their professional development designs related to mathematical tasks, particularly when the professional development is situated within larger reform efforts like the CCSSM.

Methodology

The current study explores design tensions related to the task-based professional development that was designed and implemented during Year 1 of the Inquiry Hub project. To address our research questions about tensions that emerge and influence the evolution of task-based professional development, we relied on participant observation and an analysis guided by Tatar’s (2007) design tensions framework. We used field notes and transcripts of design meetings and professional development sessions, as well as interviews and survey data. The study participants, data sources, and approach to identifying tensions are described below.

Participants

Four general stakeholder communities were active participants in Inquiry Hub during the 2012–2013 school year: university researchers, an engineering team responsible for the online repository, curriculum supervisors from an urban school district, and high school algebra teachers from the district. The five core members of the research team spanned the disciplines of the learning sciences, cognitive science, computer science, organizational studies, and mathematics education. The five key members of the Web engineering team included programmers, a designer, and a program manager who played a significant liaison role by attending meetings with both the researchers and district supervisors. Three key district curriculum personnel included supervisors of mathematics and science, one of whom was a co-principal investigator in the current study. Other stakeholders were occasionally involved in the work of the project, such as staff from the district assessment and technology offices, but noteworthy stakeholders absent from co-design included building principals and instructional coaches.

The co-design process was structured for teachers to have a significant influence on the products of the partnership. The district supervisors selected teachers for the project with the goal of representing varying levels of algebra teaching experience and expertise with curriculum development. Over the course of the 2012–2013 school year, 12 teachers participated in total, typically in groups of 6–8 at any one time. These teachers represented nine different high schools from across the district, and most were teaching ninth grade Algebra 1 at the time of their participation. Teachers formed what was known as the Teacher Advisory Board (TAB), a name chosen to emphasize the teachers’ role as co-designers. During TAB meetings, researchers and district supervisors regularly solicited teachers’ input to guide the overall project direction and to predict how specific design activities might be valued by other teachers.

Sources of data

The sources of data for the present analysis are field notes, transcripts of meetings, a survey completed by teachers at the end of the 2012–2013 school year, and interviews with district curriculum supervisors conducted in the Fall of 2014. Field notes of meetings were used as the primary source data. Meetings are seen as more than a coincidental setting for joint work; they are complex cultural events where groups negotiate collective goals, power and authority, devise action strategies, and carry out action (Schwartzman 1989; Sprain and Boromisza-Habashi 2012). Meetings make a useful focal point for analyzing how tensions emerge in project activity and how they are sustained, reformulated, or resolved over time. Our corpus of meeting data includes field notes from 14 weekly meetings of the research team, 25 weekly meetings between the researchers and district supervisors, and eight meetings of the TAB with teachers, researchers, and district supervisors that occurred between December of 2012 and May of 2013.

Approach to identifying and analyzing design tensions

To identify design tensions within our dataset, we needed to distinguish tensions from problem solving or decision making. First, we applied this guidance from Tatar:

The tension could be constituted by a dichotomy between two goals, or by a continuum, or by the relevance of two or more incommensurate forces. What unites the elements in a tension is the competition, within the framework of the project, for one or more limited resources. If only one constituent exists, there is no competition, no tension and no need to balance. (Tatar 2007, pp. 445–446)

Tensions provide a means to “conceptualize design not as problem solving, but as goal balancing” (Tatar 2007, p. 415). In terms of guiding our analysis, design tensions require orienting to conflicts where often the optimal outcome is an optimal compromise (Tatar 2007). Looking for instances of opposition, contradiction, or competition in the discourse of participants provided a potential pool of tensions from which a subset concerned the analysis of mathematical tasks.

In order to declare participants’ discourse as indexing a design tension, we applied four particular criteria. First, the discourse had to include overt talk and deliberation about a task or project activity and also include an overt justification, such as a reason or course of action, for the proposed task or activity in relation to a goal or valued end. Second, the discourse had to also include two or more proposals for the task or activity. In doing so, the discourse demonstrated a competition between possible stances and a need to balance them (Tatar 2007). Third, in terms of gauging whether a tension had enough of a presence to serve as a design tension within a project, deliberations within discourse had to have an extended nature (e.g., 15 min or more) within a session and/or surface over multiple sessions. Fourth and last, a design tension had to lead to some observable result such as a change or a decision regarding the design. Such a change or decision may manifest itself as a change to an object of design (e.g., a task rubric), a change to the rule or rationale for applying the object of design, a change to the dimensions underlying a design object or process that reflects participants’ priorities or goals, or a reaffirmation of a previous position (e.g., “doubling down” on a certain course of action).
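The four criteria can be read as a conjunctive decision rule. The following is only an illustrative sketch of that reading, not the coding procedure used in the study: the `DiscourseEpisode` record, its field names, and the `indexes_design_tension` function are hypothetical, and the 15-minute threshold is taken from the example given in the third criterion.

```python
from dataclasses import dataclass


@dataclass
class DiscourseEpisode:
    """Hypothetical coded record of one strand of deliberation (illustrative only)."""
    has_overt_justification: bool    # criterion 1: overt deliberation with a stated reason tied to a goal
    num_proposals: int               # criterion 2: number of competing proposals for the task or activity
    longest_session_minutes: float   # criterion 3: longest deliberation within a single session
    num_sessions: int                # criterion 3: number of sessions in which the issue surfaced
    led_to_observable_result: bool   # criterion 4: a change, decision, or reaffirmed position


def indexes_design_tension(ep: DiscourseEpisode, min_minutes: float = 15.0) -> bool:
    """Return True only if all four criteria are met (assumed conjunctive reading)."""
    # Criterion 3 is satisfied by an extended deliberation and/or recurrence across sessions.
    extended = ep.longest_session_minutes >= min_minutes or ep.num_sessions >= 2
    return (
        ep.has_overt_justification
        and ep.num_proposals >= 2
        and extended
        and ep.led_to_observable_result
    )


# Invented example values, not data from the study.
recurring_debate = DiscourseEpisode(True, 2, 5.0, 3, True)
single_proposal = DiscourseEpisode(True, 1, 30.0, 1, True)
```

Under this reading, `recurring_debate` qualifies because it recurs across sessions even though no single deliberation was long, while `single_proposal` does not, because with only one proposal there is no competition to balance.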

Identified tensions were categorized in accordance with the four levels of design tensions specified by Tatar (2007): vision, approach, project tensions, and as-created situations. Recall, the vision tension embodies design as a value-laden enterprise and describes the tension of “what is…and what ought to be” (Tatar 2007, p. 417), essentially what participants see as the overall purpose or objective of their work and the current state of their work. The approach tension sits below the vision tension and describes the tensions encompassed in choosing between potential general approaches to realizing the vision of “what ought to be” (Tatar 2007, p. 417). Below the approach tension, the project tensions describe the tensions surrounding the “actual decisions in implementing” (Tatar 2007, p. 418) an approach, the fine-grain decisions of seeing an approach enacted. Last, the as-created situations describe possible tensions created as a consequence of actions taken to realize the overall vision (Tatar 2007).

Results

Several design tensions focused on mathematical tasks were prominent in Inquiry Hub during Year 1 of the work in Algebra 1. Three are described here, one each at the approach, project tension, and as-created situations levels of the design tensions framework (Table 1). At the approach level, there was a tension in the selection of task attributes to be used by teachers in the analysis of mathematical tasks. At the project tension level, tensions persisted around the design of a rubric for analyzing the language of tasks. Lastly, at the as-created situation level of the design tensions framework was a tension related to modifications teachers might make to tasks upon implementation. There was relatively little tension at the vision level of the framework, with an agreed goal of improving Algebra 1 teachers’ capacity to enact CCSSM reforms through curriculum improvement.

Table 1 Design tension framework

Tensions in the approach: negotiating the qualities of tasks to consider in analysis

The decision to organize project work around mathematical tasks was negotiated between district curriculum supervisors and researchers between July and December of 2012. When interviewed about key decisions made in the Algebra 1 work of Inquiry Hub, two of the three district supervisors, Hillary and Michelle, identified the focus on mathematical tasks as an important decision for the project. Hillary recalled “the back and forth with [the researchers] and us about what would be the factors that would go into task analysis, and that was when we really committed to the cognitive demand [of tasks]” (interview, October 23, 2014). When prompted, the third district curriculum supervisor, who focused mostly on science in the district, recalled that the decision to focus on tasks was led by Hillary because of her expertise in mathematics education and the shared belief that teachers “aren’t going to get high-level answers if you don’t ask high-level questions” (Laura, interview, October 10, 2014). All three district supervisors referred to rigor and/or cognitive demand in their interviews and described tasks in ways that communicated their vision for high-quality mathematics teaching and learning in the district, using messaging similar to that used with teachers in TAB meetings at different points throughout Year 1 of Inquiry Hub.

The “back and forth” referred to by Hillary is reflected in meeting notes from the Summer and Fall of 2012. Initially researchers suggested a focus on “productive adaptation” of curriculum (meeting notes, July 23, 2012) as a useful approach for preparing Algebra 1 teachers for the CCSSM, including the development of authoring tools for teachers (meeting notes, August 28, 2012) as well as analyzing teachers’ use of teacher-created materials (meeting notes, September 4, 2012). Researchers also surfaced a need for teachers to do task analysis in a way that was simple but rooted in learning sciences and mathematics education research (meeting notes, August 28, 2012). District supervisors pressed the team to pursue task analysis as a focal point for joint work and requested that researchers find or develop a selection of rubrics and guides for rating tasks along dimensions of standards alignment, cognitive demand, and language. When researchers suggested a focus on learning trajectories as an alternative to tasks, Hillary responded with “Why not just use the curriculum guides we already have?” and “I don’t want to sound too pedestrian, but I want us to help teachers identify and use tasks that extend our current program” (meeting notes, September 24, 2012).

The researchers recognized the stronger research base around the selection and use of mathematical tasks, heeded Hillary’s recommendation, and recognized how a task rating process might allow for the investigation of productive curriculum adaptation and other aspects of teacher practice valued within mathematics education and learning sciences research. Task-based work could also be coordinated within the district’s existing curriculum infrastructure by placing high-quality tasks rated by teachers in the online curriculum repository alongside digital versions of publisher materials already adopted by the district. Researchers proceeded to assemble a selection of task guides and rubrics based on relevant research literature, while attempting to consider what teachers actually do in implementing tasks (meeting notes, September 25, 2012). The proposed rubrics and guides began with the district supervisor-suggested qualities of cognitive demand (Stein et al. 2009) and language (Moschkovich 2012). Researchers added task “launch” (Jackson et al. 2012), cultural relevance (Taylor 2011), and use of technology (meeting notes, October 9, 2012) to the proposed list of qualities and guides to consider.

This approach tension concerning task qualities was particularly evident in negotiations concerning a proposed rubric for evaluating the cultural relevance of tasks. Researchers proposed adapting a framework developed by a colleague (Taylor 2011) for purposes of task analysis; for us, considering ways to connect mathematical tasks meaningfully to student experiences was an important equity consideration. The team chose not to adopt this rubric: Hillary argued that cultural relevance was better considered as an aspect of teacher planning for a particular group of students rather than a general characteristic of task quality (meeting notes, October 22, 2012). Michelle agreed, stating, “If teachers determine it’s a worthwhile task, there ought to be a place to make some notes about how that task is supported,” again suggesting a future phase of work focused on supporting the implementation of tasks. We deferred to district leaders in this instance, rather than pursuing this particular approach to foregrounding equity at that time. Before the first TAB meeting with teachers, the list of task qualities to consider with the TAB was narrowed to alignment with the CCSSM, cognitive demand, language, and technology.

Project tension: co-design of language rubrics

One of the most prominent project tensions in Year 1 of the Inquiry Hub project was seen in the iterative design cycles needed to revise rubrics that assessed the language demands of mathematical tasks. Unlike cognitive demand, which had a sound foundation in research and a well-established framework for use in professional development, rubrics for the language of tasks that could be used in a similar manner were not known to district leaders or project researchers. The district, with about 30 percent of its students not speaking English as their first language, had a number of initiatives designed to give English learners full access to educational opportunities. District curriculum leaders expressed a desire to support these efforts in the task analysis routine, prompting researchers to draft an initial language rubric organized around levels of academic language support to be identified in tasks.

Tensions in the use and revision cycles of the language rubric were rooted in two significant differences between stakeholder groups. First, teachers, district leaders, and researchers differed in their goals for the language rubric. In the first TAB meeting, teachers expressed a desire to have a language rubric that was either borrowed from or similar to materials they were already using from a district-provided professional development program for English language acquisition (ELA). District leaders resisted some of the teachers’ suggestions, thinking the extensive curriculum quality guides from the ELA program would be too burdensome to use with individual tasks in the context of our envisioned task analysis routine. Researchers, not initially aware of the scope and details of the district’s ELA efforts, focused their attention on two related ideas about language of tasks supported in research (Moschkovich 2012): the difficulty of the vocabulary a student would need to know in order to engage with the task, and the ways in which the task allowed students to demonstrate their understanding of mathematics.

The second significant difference in the stakeholder groups was, somewhat ironically, differing preferences regarding language about language—or, more descriptively, the set of words used in the rubrics to codify various uses and interpretations of language in tasks. When an initial attempt at a single language rubric failed in the first TAB meeting, the third author of this paper proposed two language rubrics, one focused on demands that students engage in the language practices of mathematics and another focused on access to the mathematical content. Within the group of researchers, there was time to discuss these two ideas at length and come to a consensus understanding of demand and access. Teachers, however, struggled to use the rubrics consistently and questioned the meaning of these two terms. This particular tension persisted through multiple TAB meetings until it was understood by researchers that the rubrics used the terms demand and access in ways contradictory to another district ELA effort.

The continued tension—evident in the difficulty the team faced in coming to agreement on task ratings using demand and access—prompted researchers to use new descriptors, form and function, that agreed with teachers’ prior ELA experiences and had a foundation in educational research on language (Solano-Flores 2010). In subsequent meetings, the new rubrics slowly gained acceptance with teachers, though inter-rater agreement remained a challenge.

As-created situation: analyzing tasks as written versus task modification

With a project approach focused on the analysis of mathematical tasks, and design efforts directed at teacher consensus-building through the use of task rubrics and discussions, a situation was created in which tasks needed to be analyzed as written. A tension between analyzing tasks as written versus modification of tasks first emerged when an early draft of the language rubrics suggested task modifications for English language learners, which raised concerns from Hillary: “One thing I worry about is, how will a teacher know if a task is appropriate for modification? Or if it has no guide for modification?” (meeting notes, October 22, 2012). Rather than modifying tasks, the district supervisors requested supporting materials for English language learners that could be used with all tasks, including those in the district-adopted textbook. It was agreed that the development of modification and implementation supports could be pursued in a future phase of the project and that task analysis would apply to tasks only as written. To provide an example, the district supervisors sent sample tasks to the researchers for which supporting material in the form of standard alignments and a lesson plan had been added, but the task itself remained unmodified from the original.

In the very first TAB meeting, teachers resisted the notion of analyzing tasks as written and instead focused on their intended uses of the task. They expressed difficulty in divorcing themselves from the particular contexts of their own classrooms and their perceptions of their students’ abilities to engage in the task. For example, when discussing cognitive demand, teachers indicated their ratings would depend on where in the curriculum they might use the task, or if the task was to be used with a relatively higher- or lower-ability group of Algebra 1 students. They were particularly concerned with using certain cognitively demanding tasks “as is” with students whom they judged to be of lower ability. Similarly, when rating tasks for technology, some teachers made assumptions about the technology their students would use even though no technology use had been explicitly called for in a task. Seeking consistency in the rating process and consensus among raters, the researchers and district leaders encouraged teachers to evaluate the tasks only as written and their “qualities independent of the particular groups of students” (meeting notes, December 1, 2012).

Despite a focus on tasks as written, consistent task rating agreement amongst teachers remained elusive throughout the TAB meetings. Teachers’ desire to adapt tasks to their classroom contexts was also evident in the year-end teacher survey. When asked what factors not captured in the rubrics influenced their use of tasks, teachers gave answers that included “individual student abilities,” “the needs of my students,” “whether the task will be engaging/interesting to my students,” and “level of engagement from the students.” Some teachers also questioned the value of task rubrics and the rating process, preferring to either have more flexibility to modify tasks or have a larger selection of tasks to choose from. When asked how they would choose to design professional development around the CCSSM, teachers’ survey responses included:

Olive: “I would want a focus on how these resources can be used in my unique situation”

Tina: “I really just wanted to focus on creating better tasks … I don’t really care too much about the rubric”

Vickie: “[I would give] teachers resources that would enable them to create their own tasks”

Reflecting in their follow-up interviews during the fall of 2014, both Hillary and Michelle revisited the decision to avoid task modification. Michelle questioned if teachers could “separate the task as written from how they imagined using it” (interview, October 20, 2014) while Hillary believed task analysis should be about “looking at the task and what the task is actually asking kids to do, not how you teach the task” (interview, October 23, 2014). Yet, both saw missed opportunities to support task modification in professional development: Hillary said it “could’ve engendered some really rich conversations” (Hillary interview, October 23, 2014), and Michelle noted that it could have been helpful for teachers to talk “about how modifications change the cognitive demand” (Michelle interview, October 20, 2014).

Managing the tensions: adapting and evolving the professional development

As a co-design team, we assessed our progress at the end of Year 1 using data from teacher participation and interviews and made plans to evolve our approach to task-based professional development. Although teachers had requested it, we did not pursue task modification as an approach in Year 2. We did, however, expand the task analysis processes to include activities in which teachers could develop additional supports for implementing tasks. As part of this effort, we also developed guides to help focus on the launch of tasks and maintenance of cognitive demand, using materials adapted from Jackson et al. (2012). And while the task analysis routine continued to ask teachers to consider tasks as written, the design of the online curriculum repository was changed to display task ratings as a distribution rather than as a consensus rating (Fig. 1).

Fig. 1 Display of task rating variability in the online curriculum repository

To us, the evolution of the project is significant for two reasons. First, it represented a path for managing tensions that accounted for multiple goals: expanding teachers’ agency in design, considering varying student experiences and preparation for high-demand tasks, maintaining cognitive demand, and promoting equity. Second, the shift we made illustrates an important quality of our partnership with district leaders, namely the commitment on the part of researchers and district leaders to adjust the process to ensure that multiple stakeholders’ goals could at least partly be met.

Discussion

The analysis of tensions underscores some familiar conflicts that arise not just within professional development but also within policy implementation. Some professional development prizes teachers’ role as designers of curriculum, highlighting teachers’ capabilities and understanding of students. By contrast, other professional development emphasizes the value of giving teachers models of materials that can heighten their expectations for students. Policy researchers have pointed to a fundamental paradox related to tensions we observed, namely that “Policies aim to solve problems, yet the key problem solvers are those who have the problem” (Cohen et al. 2007, p. 515).

Co-design with teachers has the potential to alleviate some tensions typically associated with top-down approaches to professional development and policy implementation. Researchers have highlighted difficulties with professional development models that position teachers as receivers of researchers’ knowledge and instead propose that researchers and teachers mutually engage in work around artifacts common to their respective communities (Kazemi and Hubbard 2008; Sztajn et al. 2014). By organizing our work around mathematical tasks, both researchers and teachers participated as stakeholders in a co-design process in which there was a common interest in identifying and resolving task-related design tensions. The Inquiry Hub project represents a further extension of this approach by broadening participation to include district curriculum leaders as key partners in design. The instructional realities teachers face sometimes include goals or beliefs that oppose those of administrators within their school district. There may be no more useful way of understanding teachers’ institutional contexts than to include more of that context in co-design, as doing so provides opportunities for tensions between teachers and their leaders to be understood and resolved.

In emphasizing that tensions go beyond the interaction of two communities, and even beyond the expanded list of stakeholders in Inquiry Hub, we wish to point to the importance of considering the ways that all professional development is embedded in wider contexts that should be taken into account in design. These wider contexts include values that inform participants’ suggestions for design directions, past and concurrent initiatives that compete for teachers’ attention and allegiance, and the varied experiences and capabilities of students in teachers’ classrooms. These presented themselves in our project as different goals, which were discussed but not always adopted, sometimes deflected, and sometimes just deferred. Yet, mathematical tasks and their qualities were a focal point for all of us, bringing the varied activities of our communities into alignment for a time, while the design tensions framework helped us understand how the choices we made attempted to balance our multiple values and goals.

Conclusion

High-quality mathematical tasks can be a centerpiece of efforts to implement new standards, including the CCSSM, but task-based professional development is not implemented in a vacuum. Teachers, district leaders, and researchers are likely to find different possibilities within such professional development, as well as see different constraints that they must satisfy in their respective contexts. Particularly relevant are teachers’ instructional realities that can make task implementation difficult, and how understanding tasks in the institutional context requires an approach that goes beyond simple delivery of professional development from researchers to teachers.

Partnerships like ours are a promising approach for understanding problems and investigating solutions to improve educational systems because they can be adaptive and evolve designs in response to emerging implementation challenges. As we have shown, too, it is possible to manage at least some of the tensions that arise within the process of designing and implementing professional development for a group of teachers, especially when partners share in a common vision. At the approach level of the design tensions framework, researchers and district leaders negotiated a set of task qualities to use in a task analysis routine, resolving a tension rooted in our sometimes common and sometimes competing goals. At the project tension level, researchers, district leaders, and teachers iterated the design of language rubrics until they better reflected a common understanding of each other’s knowledge and resources. The consequences of our decisions created a situation where the goal of task rating consensus was in tension with teachers’ eagerness to interpret and modify tasks for the contexts of their classrooms.

Partnerships are not without their challenges. Even when partners share a common vision and generally agree upon an approach, project tensions stemming from the lack of design consensus affect participation and learning. Attending directly to these tensions, however, can help partners understand how compromises attempt to optimally balance goals, values, and resources. Building professional development efforts around mathematical tasks continues to be a promising approach to implementing new standards, and successful confrontation of project tensions in a collaborative design process could yield new task adaptation and implementation practices that, while difficult to achieve, have sustainable impacts across an educational system.