Introduction

The incorporation of argumentation in the mathematics classroom has earned growing appreciation in recent years (Krummheuer, 2007; Staples & Newton, 2016). Firstly, mathematicians construct knowledge socially by generating and evaluating alternative arguments. Secondly, studies suggest that argumentation requires students to investigate, challenge, and evaluate alternative positions, and to object, support and/or justify diverse ideas and hypotheses, thereby fostering meaningful understanding and deep thinking (Asterhan & Schwarz, 2016; Francisco & Maher, 2005; Weber et al., 2008). Thirdly, argumentation-promoting instruction has been shown to nurture the students’ mathematical autonomy and encourage positive attitudes toward mathematics (Yackel & Cobb, 1996). Recent reform documents worldwide emphasize argumentation as an important academic goal (e.g., CCSSI, 2010; Israel Ministry of Education, 2019).

In parallel, research suggests that teachers are inadequately prepared to recognize and exploit argumentation opportunities, and that argumentation in the mathematics classroom is not yet widespread (Bieda, 2010; Jacobs et al., 2006; Sriraman & Umland, 2020; Staples et al., 2012). It therefore appears crucial to investigate how best to design effective professional teacher learning for enhancing argumentation in the mathematics classroom. We addressed this issue by building on existing teacher-noticing research and exploring a particular type of noticing, which we call noticing of argumentation (Ayalon, under revision).

We regard argumentation as having two important and interrelated aspects—structural and dialogic (Jiménez-Aleixandre & Erduran, 2008; McNeill & Pimentel, 2010). The structural aspect focuses on discourse in which a claim is supported by an appropriate justification. The dialogic aspect focuses on the interactions between students as they generate ideas and critique those of their peers (Asterhan & Schwarz, 2016; McNeill & Pimentel, 2010; Mueller et al., 2012).

In general, noticing a situation in the classroom involves three interrelated skills: attending, interpreting, and responding (Jacobs et al., 2010), all of which are considered crucial in determining teachers’ proficiency (ibid.). Therefore, we hypothesize that teachers who are better equipped to notice argumentation possess important skills necessary to practice and promote argumentation in the mathematics classroom. Following Jacobs et al. (2010), the present study conceptualizes noticing of argumentation as a skillset comprising three interconnected components: attending, interpreting, and deciding how to respond. Attending relates to identifying salient characteristics, both structural and dialogic, of argumentation in the classroom situation (McNeill & Pimentel, 2010). The structural aspect relates to recognizing the claims and their justifications according to the types of justification that are accepted in the classroom (Yackel & Cobb, 1996). The dialogic aspect relates to identifying the components of co-constructing arguments, critiquing peer arguments, respecting others' arguments, and working toward building a group consensus (Asterhan & Schwarz, 2016; McNeill & Pimentel, 2010; Mueller et al., 2012). Interpreting is associated with reasoning and making sense of the argumentation in the classroom setting, as well as considering factors that might either enable or inhibit argumentation, such as task characteristics, cognitive and affective student characteristics, socio-cultural characteristics, and teaching strategies (e.g., Staples, 2014; Yackel & Cobb, 1996). Lastly, deciding how to respond relates to what a teacher would do to foster argumentation in the given situation. Drawing on previous research (e.g., Ayalon & Wilkie, 2020; Topping, 2010), we investigate the potential use of peer-assessment strategies to develop secondary-school mathematics teachers’ (SMTs) noticing of argumentation.

A cohort of 61 Israeli SMTs participated in the study as part of a master's degree course focusing on the analysis of argumentation classroom situations (ACSs), which serve as both a pedagogical and a research tool. An ACS is a written representation of a real-life instructional situation in the mathematics classroom that provides teachers with opportunities to attend to structural and dialogic aspects of argumentation. ACSs also allow teachers to offer interpretations of the argumentation sequence in the situation and to address factors that appear to enable or inhibit the argumentation. Throughout the course, the SMTs participated in three peer-assessment cycles comprising (a) individually analyzing an ACS using a report format, (b) collaboratively assessing peers' ACS-reports using an ACS rubric and providing feedback to peers, (c) receiving feedback from peers and individually refining the initial ACS-reports, and (d) reflecting on their experience. This paper focuses on the first cycle. We aim to explore changes in SMTs’ noticing of argumentation through experiencing peer-assessment strategies and to gain insights into the aspects that supported or inhibited these changes.

Theoretical background

In developing this study, we drew on the educational research literature on argumentation, teaching for argumentation, and teacher noticing. These are elaborated on in the following sub-sections.

Argumentation

Numerous definitions of and approaches to argumentation appear in the education literature (Schwarz & Baker, 2017). The theoretical perspective for the approach taken in this paper views argumentation as a social process, situated, in our case, in the social norms of the classroom. We therefore follow van Eemeren and Grootendorst’s (2004) definition, according to which argumentation is “a verbal, social, and rational activity aimed at convincing a reasonable critic of the acceptability of a standpoint by putting forward a constellation of propositions justifying or refuting the proposition expressed in the standpoint” (p. 1). Argumentation, by this definition, entails producing claims, delivering supportive proof to validate these claims, and finally, assessing their validity. By this definition, argumentation is situated in a social space and, when infused into the classroom discourse, it affords a place for students to articulate and critically evaluate alternative ideas, which ultimately supports the collaborative construction of knowledge (Asterhan & Schwarz, 2016). This definition lays the groundwork for common descriptions of “deliberative argumentation” that have proven to be exceptionally ‘fruitful’ for learning (Felton et al., 2009). This form of argumentation features learners collaborating on constructing arguments, listening to others’ ideas critically and respectfully, identifying the pros and cons in each idea, and striving to reach a consensus (Asterhan & Schwarz, 2016, p. 167).

The present study regards argumentation as having two important and interrelated aspects—structural and dialogic (Jiménez-Aleixandre & Erduran, 2008; McNeill & Pimentel, 2010). The structural aspect focuses on discourse in which a claim (the assertion of which an individual attempts to persuade another) put forward as an idea, solution, conclusion, hypothesis, etc., is supported by an appropriate justification. In the mathematics classroom, a justification's appropriateness is determined by the prevalent socio-mathematical norms (Yackel & Cobb, 1996).

Coupled with the structural aspect, the dialogic aspect regards argumentation as the verbal discourse between learners as they generate ideas and critique those of their peers (Asterhan & Schwarz, 2016; McNeill & Pimentel, 2010; Mueller et al., 2012). Mathematics is commonly viewed as a social enterprise whereby the community of mathematicians shares established norms of argumentation for advancing mathematical knowledge (Davis & Hersh, 1981). The dialogic aspect corresponds with this view. In mathematics classrooms, this can take the form of students listening to each other, building upon their peers' ideas, and critiquing those ideas with respectful tone and substance as the group collaborates toward reaching a consensus (Mueller et al., 2012). In this study, the combined structural and dialogic dimensions are considered essential properties of argumentation classroom situations. In our view, paying attention to both aspects of argumentation can help teachers incorporate argumentation more beneficially into their classroom practice (Jiménez-Aleixandre & Erduran, 2008; McNeill & Pimentel, 2010).

Teaching for argumentation

Mathematics teaching that encourages argumentation can provide opportunities for students to take an active role, i.e., to construct arguments, share, consider others' ideas, and critically evaluate the validity of those ideas, while adhering to normative aspects of mathematical discourse that are specific to the students’ mathematical activity (Ball & Bass, 2003; Yackel & Cobb, 1996). In the literature, various factors are associated with teaching that creates opportunities for students to participate in argumentation (Mueller et al., 2014; Staples, 2014; Yackel, 2002; Yackel & Cobb, 1996). Drawing on the literature, we focus on four key factors: the nature of the mathematical tasks, the teaching strategies used, the students’ cognitive and affective characteristics, and the socio-cultural characteristics. Further discussion can be found in Ayalon and Nama’s paper (First online).

Task characteristics

The nature of the mathematical tasks selected and implemented is profoundly associated with mathematics classroom argumentation (e.g., Ayalon & Hershkowitz, 2018; Mueller et al., 2014). For example, tasks that invite multiple representations and strategies for solutions (Francisco & Maher, 2011; Mueller et al., 2014; Solar et al., 2020) afford building various claims and justifications (Bieda et al., 2014) and provide the opportunity for students to collaborate on seeking alternative ideas, discussing differences in viewpoints, and critiquing ideas (Mueller et al., 2012).

Teaching strategies

Teachers’ actions are fundamentally associated with mathematics classroom argumentation (e.g., Ayalon & Even, 2016; Conner et al., 2014; Staples, 2007). Examples include encouraging students' participation and using questions that foster raising, discussing, and evaluating ideas (Solar et al., 2020); valuing arguments that address the whys of results and not just the results themselves, as well as preparing the ground for what counts as an acceptable justification in the classroom (e.g., Staples, 2014; Yackel, 2002; Yackel & Cobb, 1996); explicating the main ideas of mathematical argumentation (e.g., the use of a counterexample to refute a claim) and elucidating the argumentative basis of students’ claims; creating opportunities for students to co-construct arguments and critically consider each other's ideas in a way that promotes mutual respect (Kosko et al., 2014; Mueller et al., 2014; Nathan & Knuth, 2003) by posing questions like “Do you agree or disagree, and why?”, “Is this always true?”, and “What might somebody say to oppose that?” (Asterhan & Schwarz, 2016); and praising students who express their own ideas or challenge the ideas of others.

Students’ characteristics

Argumentation activities are considered cognitively and emotionally demanding (Slakmon & Schwarz, 2019). In cognitive terms, teaching for argumentation calls for sensitivity, for example, to students' ways of mathematical thinking (e.g., students' tendency to produce arguments based on examples instead of deductive arguments; Chazan, 1993), students’ prior knowledge, common mistakes (Asterhan & Schwarz, 2016; Knuth & Sutherland, 2004), and argumentation skills (e.g., collaborating on constructing arguments, critiquing and questioning ideas, and revising their own ideas based on the discussion) (Asterhan & Schwarz, 2016; Mueller et al., 2012; Stein & Albro, 2001). In affective terms, it calls for sensitivity to students' emotions, self-confidence, interest, and joy (Ayalon et al., 2022; Slakmon & Schwarz, 2019). For example, facing opposition from other participants, failing to comprehend others’ arguments, asserting ideas without attempting to reach a consensus, or being ignored are all integral to argumentation and may contribute to negative emotions (Ayalon et al., 2022; Stein & Albro, 2001).

Socio-cultural characteristics

Socio-cultural characteristics can affect the mathematics classroom argumentation. For example, the social nature of educational aspects related to school and institutional norms can affect the teacher's classroom practices, such as textbook use, curriculum standards, and high-stakes tests (e.g., Ball & Forzani, 2009; Chazan et al., 2016); the classroom social norms, such as recognizing the value of argumentation and expectations for critique, collaboration, and mutual respect (e.g., Martino & Maher, 1999; Mueller et al., 2014; Yackel, 2002); and the socio-mathematical norms related to the kinds of justifications accepted in the classroom (Yackel & Cobb, 1996).

The teacher's role in encouraging argumentation is not simple; it entails considering various aspects, such as the mathematics involved, student thinking, given tasks, and socio-cultural features (Ayalon, 2019; Ayalon & Nama, First online; Staples, 2014). Research shows that teachers may encounter difficulties in incorporating argumentation into classroom practice in ways that engage students in constructing and responding to arguments (Bieda, 2010; Conner et al., 2014; Zhuang & Conner, 2022). Moreover, teachers' interpretation of facilitating mathematical argumentation can be misaligned with what reformers in mathematics education envision; for example, teachers may mistakenly believe that mathematical argumentation can occur with relatively little scaffolding by the teacher (Kosko et al., 2014). Investigating how to devise effective professional learning for enhancing argumentation in the mathematics classroom is therefore an important goal. Toward achieving it, this study builds upon the literature on teachers' noticing to further explore and develop teachers' noticing-of-argumentation. Specifically, this study focuses on teachers' noticing of structural and dialogic aspects of argumentation, and of factors associated with teaching that creates opportunities for students to participate in argumentation, namely task characteristics, teaching strategies, student characteristics, and socio-cultural characteristics.

Teacher noticing

Noticing, a term used in everyday language, signifies the act of observing or recognizing something. However, certain professions have specific ways of noticing. Understanding and promoting productive noticing by mathematics teachers is a fast-growing area of research (Jacobs et al., 2010; Sherin et al., 2011a; van Es et al., 2017). Mathematics education currently encompasses divergent conceptualizations of noticing (König et al., 2022; Scheiner, 2021). As do many other researchers, we see noticing as consisting of three interrelated skills: attending to noteworthy features of instruction, interpreting them while taking on different perspectives to gain a deeper insight into what is being observed, and deciding how to respond (Jacobs et al., 2010). Accordingly, we position ourselves, by and large, as taking a cognitive perspective on noticing (König et al., 2022; Scheiner, 2021). Noticing has been deemed a pivotal component of mathematics teachers' expertise and is essential for enhancing teachers’ reflection on their teaching aimed at improving their practices (e.g., Star et al., 2011). Noticing skills are particularly essential to instruction in which teachers make quick decisions while simultaneously attending to students’ mathematical thinking and utilizing students' ideas to develop the lesson's content further (Santagata, 2011; Sherin et al., 2011a).

The complex mathematics classroom environment is a space where multiple things happen concurrently, and teachers are largely incapable of attending equally to all this wealth of activity (Sherin et al., 2011b). Therefore, what teachers do notice may not always be beneficial and may fail to optimize critical opportunities that emerge to further develop and refine the lesson. Consequently, they must learn to filter through that complexity and decide where to devote their instructional attention and efforts. Research focusing on the development of teachers' noticing skills indicates shifts in what and how teachers are noticing; specifically, in the progression from describing technical aspects of teaching to more detailed noticing with interpretations of teachers' pedagogy and students' thinking (van Es, 2011). The literature also points out differences in how teachers have shifted their noticing, in terms of essence, time taken, extent, consistency, and sustainability (e.g., van Es et al., 2017).

Numerous studies have proposed training programs that provide mathematics teachers with appropriate settings in which to practice noticing, commonly using written classroom situations (e.g., Rotem & Ayalon, 2023; Scherrer & Stein, 2013) or video excerpts (e.g., González & Skultety, 2018; Sherin & van Es, 2009) as a tool for measuring progress. Some researchers have focused on teachers' noticing of their students' mathematical thinking (e.g., Bas-Ader et al., 2021; van Es & Sherin, 2008), while others have underlined specific criteria, e.g., practices such as justification and generalization in mathematics (e.g., Melhuish et al., 2019, 2020). Melhuish et al. (2020) focused on characterizing elementary school teachers’ noticing of structural aspects of argumentation, namely the mathematical content and reasoning form. Our study extends these studies by focusing on SMTs' attention to both structural and dialogic aspects of argumentation, interpretation of the argumentation through multiple lenses, and deciding how to respond. We aim to investigate and develop a particular type of SMTs' noticing, which we call noticing of argumentation (Ayalon, under revision).

Conceptualization of teachers’ noticing of argumentation

Following Jacobs et al. (2010), and based on the educational literature on argumentation summarized above, we conceptualize noticing of argumentation as a set of three interrelated skills: attending, interpreting, and deciding how to respond (Ayalon, under revision). The study adopted a theoretical perspective in which argumentation that is ‘fruitful’ for learning is characterized by students collaborating on constructing arguments, listening critically and respectfully to others’ ideas, and working toward consensus-building (Asterhan & Schwarz, 2016, p. 167). Attending relates to identifying salient characteristics, both structural and dialogic, of argumentation in the classroom situation (McNeill & Pimentel, 2010). The structural aspect focuses on the proposed claim and the justification for the claim, in our context in accordance with the accepted types of justification in the classroom community (Yackel & Cobb, 1996). The dialogic aspect is related to these components: co-constructing of arguments, critiquing peer arguments, mutual respect, and working toward consensus-building (Asterhan & Schwarz, 2016; McNeill & Pimentel, 2010; Mueller et al., 2012). Similar to the work of Jacobs et al. (2010), who looked for evidence of participants’ attending to student mathematical thinking, our work examined teachers’ responses for evidence of attention to structural and dialogic aspects of argumentation. Interpreting relates to reasoning and making sense of the argumentation in the classroom situation, considering factors that may enable or inhibit the argumentation, including task characteristics, teaching strategies, cognitive and affective student characteristics, and socio-cultural characteristics (e.g., Staples, 2014; Yackel & Cobb, 1996). Similar to the research of Jacobs et al. (2010), who looked for evidence of participants’ interpretation of student mathematical thinking, our work looked at teachers’ responses specifically for evidence of interpretation of the argumentation. Finally, deciding how to respond relates to what one would do to foster argumentation, assuming one were the teacher in that situation. Similar to the work of Jacobs et al. (2010), who looked for evidence of considering the specific child's strategy and responding in a way that is likely to further this child's understanding, in our work we searched for evidence of responding in a way that supports students’ engagement in argumentation in the given situation.

Figure 1 summarizes our conceptualization of argumentation in the mathematics classroom and of noticing-of-argumentation. We employed this framework in building the research tool and in analyzing the data to explore the change in SMTs' noticing of argumentation, including the aspects they attend to, with specific reference to structural and dialogic aspects of argumentation, the factors that may have enabled and/or inhibited the argumentation addressed in their interpretation, and the alternatives to the teaching strategies proposed.

Fig. 1 Our conceptualization of argumentation in the mathematics classroom and the components of noticing of argumentation (Ayalon, under revision)

Argumentation classroom situation (ACS)

It is widely accepted that a classroom situation, presented for instance through written transcripts or videos, can be used to represent a specific instructional situation designed to elicit teacher noticing (e.g., Scherrer & Stein, 2013; van Es et al., 2017). Whereas some studies on teacher noticing aim solely to describe what teachers notice, the majority of studies use, either implicitly or explicitly, an accepted frame of reference that dictates what teachers should notice (e.g., Stockero & Rupnow, 2017). This study defines an argumentation classroom situation (ACS) as a real instructional situation that occurred in the mathematics classroom and shows considerable potential for learning to notice argumentation. In accordance with van Eemeren and Grootendorst's (2004) definition, an ACS is a mathematically focused interaction among students and a teacher in which students raise claims and justifications, build on each other's ideas, and critique ideas as the class moves toward consensus-building. Although this definition requires further refinement, it nevertheless provides the essential criteria, in terms of structural (claims, justifications, types of justification) and dialogic (co-constructing of arguments, critiquing arguments, mutual respect, working toward consensus-building) aspects, for what to attend to in an ACS. In addition to drawing on both aspects of argumentation, the ACS enables teachers to offer rich interpretations of the argumentation from different perspectives, which consider diverse factors that might have enabled or inhibited it, and to propose alternative teaching strategies.

Teachers' professional learning

Mathematics teachers' professional training is perceived as focusing on situation-specificity (Borko et al., 2011). It should not be inferred that positing professional learning opportunities in practice means that a teacher's classroom is necessarily the only space where professional development can occur; rather, it involves drawing upon key instructional practices to generate opportunities for teacher learning (Borko et al., 2008). From a research perspective, encouraging teachers to collaborate in response to mathematically and pedagogically specific situations inherent in their teaching can yield important insights into teachers' conceptions and intended practices (Biza et al., 2007). Thus, there is potential, in terms of both research and teacher education, in engaging teachers in realistic classroom scenarios (Biza et al., 2007).

In the present study, the teachers experienced a sequence of activities in a peer-assessment cycle (see Fig. 2):

  • Analyzing the ACS “Abbreviated multiplication formulas” using the written report format;

  • Assessing peers' written ACS reports using the ACS rubric format;

  • Refining the initial ACS reports; and

  • Reflecting on their experience.

The strategy of teachers analyzing classroom situations has been advocated in studies of teachers' professional learning as a way to engage teachers in a cycle of experimentation and reflection. We sought to create similar opportunities for teachers' learning by providing them with ACSs and asking them to analyze them, along with their peers' written ACS reports, and to assess the quality of those reports using a rubric format. These ACSs served as artifacts to analyze, compare, and reflect on, so as to facilitate the teachers' noticing of a wider array of argumentation characteristics. Even though this approach of engaging teachers in analyzing other teachers' ACS reports is novel, it still resonates with theoretical perspectives on learning which focus on teachers' “participation in socially situated practices” (Lave, 1996, p. 150). By incorporating a cycle of analyzing an ACS by using an ACS-report format and refining the ACS-reports by utilizing peer feedback, we sought data that provide insights into any changes in the teachers’ noticing-of-argumentation. In addition, written reflections were elicited from the teachers at the conclusion of the activity, and semi-structured interviews were conducted with 20 teachers to find any evidence of outcomes which the teachers themselves found salient.

Proposing peer-assessment as an external domain tool for learning

In peer-assessment, learners are expected to consider and specify the level, value, or quality of a product or performance of other equal-status learners (Topping & Ehly, 1998). Studies, particularly those focusing on higher education, have largely examined the reliability and validity of peer assessment for summative grading of [college undergraduate] course assignments, as compared to assessment by the course instructor. Interest in peer-assessment, perceived as a complementary component of formative assessment practices, has been growing steadily (Black & Wiliam, 2009; Topping, 2010). Several studies exploring qualitative peer feedback have found evidence of its potential for effective learning. Peer-assisted learning may provide more immediate, timely, and individualized results than teacher feedback (Topping, 2010). It has also been found to develop self-regulation and metacognition, improve learners' communication skills, instill a better understanding of the assessment criteria, and increase self-awareness of one’s own work quality (Brown & Harris, 2013; Sadler, 1998; Topping, 2010). When learners analyze their peers' work, they can access a wide variety of others' attempts and examples that help them better notice gradations in quality (Topping, 1998). Moreover, since they did not create the work themselves, they tend to view it from a more remote perspective, which facilitates analyzing it more objectively, as compared to subjective self-assessment (Black et al., 2003). Peer feedback has also been found to have potential benefits for learning because it is qualitatively different from conventional teacher feedback. Not having an unequivocal ‘knowledge authority,’ coupled with the uncertainty prompted by a peer’s comparatively equal status, may induce learners to seek to validate the feedback, thus leading to further thinking, analysis, and discussion (Yang et al., 2006).

Much of the literature on peer assessment as a learning strategy in a teacher education context (e.g., Zevenbergen, 2001) has focused on teachers' development of Content Knowledge (e.g., mathematics problem solving). Recently, Ayalon and Wilkie (2020) explored the potential of using peer-assessment as a tool for cultivating pre-service mathematics teachers’ Pedagogical Content Knowledge (Shulman, 1986) and, in particular, formative assessment principles and practices. Their findings provided evidence as to the likely success of such a strategy in supporting participants to see formative assessment practices and their role, knowledge, and values in different and more complex ways. In this study, we conceptualize peer-assessment processes as a learning tool for enhancing teachers’ noticing of argumentation. In exploring the potential of such processes for improving SMTs' noticing of argumentation, we adapted Ayalon and Wilkie’s (2020) peer-assessment cyclical model, shared in the next section.

This study addresses the following two research questions:

RQ1 What, if any, change occurs in secondary mathematics teachers' noticing-of-argumentation, through experiencing peer-assessment strategies?

RQ2 What factors promoted or inhibited the change in secondary mathematics teachers’ noticing-of-argumentation, from the teachers’ point of view?

Research design

Peer-assessment cycle design

The current study draws on the model adapted by Ayalon and Wilkie (2020) (see Fig. 2) in an effort to help SMTs cultivate noticing-of-argumentation.

Fig. 2 The study’s peer-assessment cycle (adapted from Ayalon & Wilkie, 2020)

Phase 1: Preparing a written report on ACS (initial ACS report)

The cycle began with teachers receiving a transcription of the ACS “Abbreviated Multiplication Formulas” (see Fig. 3 in the Research tools section). They were asked to prepare a written report using a format (adapted from Jacobs et al., 2010) that includes prompts related to the three skills of noticing (attending, interpreting and deciding how to respond) provided by the instructors (see ACS-report in the Research tools section). This activity took place during the first meeting of the cycle and lasted 90 min.

Phase 2: Peer analysis and feedback provision

The peer analysis phase involved making judgments about the quality of the work of a peer, with the help of the rubric designed by the research team for assessing ACS written reports (see ACS rubric format in the Research tools section) to generate elaborated and qualitative feedback and give constructive critique.

The design of the peer analysis phase drew on recommendations from previous research, which utilized peer-assessment activities in other contexts to achieve change in participants’ learning in line with their goals (Ayalon & Wilkie, 2020). Sitting in small groups of 3–4 participants, the SMTs were presented with 3–4 written reports produced individually by other peers, not from their group. Participants were asked to read each report, discuss it, and collaborate on assessing it using the rubric format. We know from the research literature that involving learners in discussing and assessing their peers’ work provides an important opportunity to be exposed to diverse ideas and examples, thereby supporting them in noticing variations and gradations in quality (Topping, 1998). Moreover, since they did not create the work themselves, it is more likely that they will view it from a more remote perspective which facilitates analyzing it more objectively, as compared to subjective self-assessment (Black et al., 2003). Our study participants were asked to provide a detailed explanation for their assessments, since receiving feedback that shows the “correct” answer with no explanation is likely to be of minimal value (Hattie & Timperley, 2007). Feedback that helps participants learn to analyze, critique, and improve their work independently is arguably of the greatest benefit (ibid.). Note that the participants received the ACS rubric without being provided with a tangible example of a high-quality analysis of the ACS by the researchers. The aim was for them to look at a peer’s work through the lens of the rubric levels and to discuss that work in order to stimulate their noticing of the different argumentation components.

This assessment experience was intended to support individuals in developing the ability to apply distanced objectivity to their own work (Black et al., 2003; Reinholz, 2016). Because SMTs are ‘a step removed’ from the particular work they are analyzing, they are more likely to notice any discrepancies. Such observation can become a lens that can later be applied to their own noticing-of-argumentation. In analyzing peer work, SMTs are exposed to a variety of examples, which helps them notice variations in quality as well (Sadler, 1998). This contrasts with presenting them only with high-quality sample analyses of argumentative situations, which may make it difficult for them to determine the essence of what actually makes an analysis ‘good’.

Feedback provision engaged the SMTs in conveying feedback by writing their analyses for their peers using the rubric for assessing ACS written reports. The activities in this phase took place during the second meeting of the cycle and lasted 180 min.

Phase 3: Feedback reception and refinement of written reports on ACSs

Being given feedback helps individuals to perceive their work products and achievements from another person’s perspective and to attend to aspects of their work that might be problematic. After receiving the peer feedback, participants were asked individually to refine the reports produced in Phase 1 before submitting their final product. Knowing they will be expected to revise their work can influence the feedback that participants give others, as well as their interpretation of the feedback they receive (Reinholz, 2016). The refined report produced at Phase 3 was compared to the ACS report written in Phase 1 to ascertain whether and how being involved in the peer-assessment cycle enhanced the teachers' noticing of argumentation. We chose to use the same ACS in both the initial phase and the third phase to allow for comparisons in what the SMTs noticed (Mitchell & Marin, 2015). We acknowledge the possibility that, in using the same ACS in both phases, change in noticing may also be a result of reading the same transcript for a second time. However, using a different ACS in the final phase could affect the results as well, as different ACSs may have different affordances and limitations with regard to their stimulus for noticing. We therefore decided, in line with other studies (Mitchell & Marin, 2015; Santagata et al., 2007; Schack et al., 2013), to use the same ACS. The activities in this phase took place during the second meeting of the cycle and lasted 60 min.

Phase 4: Teachers' reflecting on the experience (individually)

As a final phase in the peer-assessment cycle, the SMTs were invited to complete a personal reflective questionnaire. The questionnaire focused on their experiences across the sequence of activities, their perceived strengths and challenges, the similarities and differences between the initial ACS report and the refined report, and what in their opinion caused them.

Participants and the context for the study

A cohort of 61 in-service secondary mathematics teachers participated in this study, which was conducted in Israel at the beginning of a university course focused on argumentation in mathematics teaching, as part of their fulfillment of a master's degree in mathematics education. All participants held a B.Ed. in mathematics education or a B.Sc. with a major in mathematics or a mathematics-related subject. The teachers’ experience ranged from 1 to 27 years, averaging 7 years; thirty-two participants had 1–5 years of experience, and twenty-nine participants had more than five years of experience; twenty-three teachers taught students aged 12–15, and thirty-eight teachers taught students aged 15–18. The teachers in this cohort had not been explicitly exposed to argumentation in their formal academic education.

The study took place during the fourth and fifth sessions of the course. The earlier three sessions focused on discussing theoretical issues related to structural and dialogic aspects of argumentation and to factors that contribute to shaping argumentation in the classroom. Following the three sessions, the teachers participated in the peer-assessment cycle. They were informed that the assessments made by their peers were intended for professional learning only and would not be used in determining their final grades in the course. As suggested to us by the ethics committee, at the end of the course, after the students' grades in the course had already been published, we asked the students for their consent to use their ACS written reports for our research. It seems that the participants were unaware of our implicit goal of exploring the change in their noticing in the course of the peer-assessment cycle. This suggests that in their refined analysis, they did not make changes in order to satisfy a certain perceived goal or expectation, but rather did so because of the value of the activity itself. All the participants consented to our request.

Research tools

Three main tools served this study: the argumentative classroom situation (ACS), the ACS written report format, and the rubric for assessing ACS reports. Below we describe each of them.

The argumentative classroom situation (ACS)

The peer-assessment cycle began with teachers receiving an ACS focusing on the issue of abbreviated multiplication formulas. The situation given took place in a 9th grade class of a teacher-colleague, who wrote it after class and gave it to us. As elaborated below, we found that the situation meets the criteria that characterize an ACS (see "Conceptualization of teachers’ noticing of argumentation" section). The first three columns in Fig. 3 present chronologically the speaker and her/his contribution in the context of the ACS, to support analysis and discussion. The right-hand column presents our identification of structural aspects of the argumentation and the dialogic aspects (in ellipses). (Note: SMTs received the ACS without the right-hand column).

To test our choice of the situation, we asked a group of eleven Ph.D. and M.A. students in mathematics education to analyze the classroom situation individually (by using the ACS-report format; see below) and then share and discuss their analyses. Based on their analysis and discussion, we reached a final consensus that the situation meets the criteria of an ACS ("Conceptualization of teachers’ noticing of argumentation" section) and enables rich interpretations of the argumentation from different perspectives. We deemed the mathematics in this task to be suitable for all teachers teaching at the various age levels.

Fig. 3 The “Abbreviated Multiplication Formulas” situation

The ACS-report format

Figure 4 presents the ACS-report format (adapted from Jacobs et al., 2010) that includes prompts related to the three skills of noticing argumentation.

Fig. 4 The ACS-report format for SMTs to complete

The ACS rubric format

The rubric was designed as (1) a pedagogical tool for the SMTs to assess, at Phase 2 of the cycle, the ACS-reports written by their peers and (2) a research tool for analyzing the initial and subsequently refined ACS reports. The rubric-format (Table 1) was designed in accordance with the prompts included in the ACS-report format, related to the three skills of noticing argumentation.

Table 1 The ACS rubric format

For example, the level of detailed description of how each dialogic aspect is manifested in the ACS (in case of attending), or the level of evidence to support their interpretations for how a factor impacted the argumentation (in terms of interpreting).

The rubric was devised and piloted in a course centered around argumentation in mathematics teaching a year prior to the present study. Throughout that course, the teachers were engaged in analyzing several ACSs with the help of the ACS report. Similar to existing research that examined changes in teacher noticing (Jacobs et al., 2010; van Es et al., 2017), we sought to capture variations in the teachers’ reports to further develop the quality of the ACS rubric. To validate the rubric and our analysis, we used credibility and trustworthiness criteria (Lincoln, 1995) by sharing the rubric and some data with eleven Ph.D. and M.A. students in mathematics education. We reached a consensus for all components except for one, “Attending to structural aspects.” Initially, we defined level 1 as “Identified correctly some claims and justifications” and level 2 as "Identified correctly all claims and justifications." During the discussion, all participants concurred that these definitions do not refer clearly to our instruction to identify the justification types. We therefore refined the distinctions between the quality levels.

An example of a ‘high-level’ analysis of the ‘Abbreviated Multiplication Formulas’ situation using the rubric is presented as supplementary material. Note that prior to the research experience, the participants had been introduced to the ACS rubric format. Following a discussion on its components, the teachers used it to self-assess another ACS. Hence, we assume that participants understood the authors' expectations from Phase 1.

Tables 2, 3, 4 and 5 present some examples from the teachers’ reports for the various quality levels of noticing.

Table 2 Quality levels of attending to structural aspects
Table 3 Quality levels of attending to the dialogic aspects
Table 4 Quality levels of interpreting
Table 5 Quality levels of responding

Data collection

Multiple types of data were collected: (i) SMTs’ ACS-reports focused on analysis of the ‘abbreviated multiplication formulas’ situation (see "The ACS-report format" section). Each SMT submitted a report at Phase 1 of the peer-assessment cycle (Initial ACS-report) and a refined report at Phase 3 of the cycle (Refined ACS-report), for a total of 122 reports. The reports served as the main data source for characterizing the participants’ skills of noticing of argumentation, and the change in skills following their participation in the peer-assessment cycle (RQ1); (ii) written reflections, submitted at Phase 4 of the cycle, focused on SMTs’ experiences through the sequence of activities, their perceived strengths and difficulties, the similarities and differences between the Initial ACS-report and the Refined ACS-report, and what caused them. The written reflections served as a source to identify the factors which promoted or inhibited the change in secondary mathematics teachers’ noticing-of-argumentation, from the teachers’ point-of-view. The SMTs were asked: (1) What do you see as your strengths in relation to the analysis of an argumentative situation? (2) What do you see as your weaknesses in relation to the analysis of an argumentative situation? (3) What is the difference between your initial analysis of the situation and your analysis following the peer-assessment process? What factors in your experience contributed (promoted or inhibited) to a change in your work? Please explain in detail (RQ2). (iii) individual, semi-structured interviews were conducted by the first author (not the course instructor) with 20 SMTs (out of 61) to gain additional insights into the research findings and pinpoint the factors which promoted or inhibited the change in SMTs' noticing of argumentation, from their perspective.
Thirteen interviews were conducted with SMTs who were found to have had a noticeable change in their noticing of argumentation, and seven interviews with SMTs for whom we did not find a noticeable change. The interviewees received their own reports and assessments, as well as the main findings relating to changes in SMTs' noticing-of-argumentation (initial versions of the tables that appear in the Findings section), several days before the interview to enable them to deeply reflect on the whole process. The interviews lasted approximately 90 min. The number of participating teachers (20) was determined to attain diverse explanations for the research findings. The main questions posed to the teachers were: (1) What stands out to you in the research findings? (2) Do the research findings make sense to you? In what ways? (3) Are there any findings that surprise you? What are they? Why? (4) Please provide possible explanations for the research findings from your perspective. What factors promoted or inhibited the change in teachers' noticing of argumentation, in your opinion? (5) What are the similarities and differences between your initial analysis of the situation and your analysis following the peer-assessment process? (6) What aspects of your experience contributed (promoted or inhibited) to the change? Please explain in detail. The teachers were encouraged to explain their responses in detail and provide examples throughout the interview.

Data analysis

For RQ1, the aim of the data analysis was to explore the change in the SMTs’ noticing of argumentation as reflected in their initial and refined ACS reports. With regard to participants’ attending to argumentation, for each ACS report we classified the participant’s responses according to the two aspects of argumentation: structural and dialogic. We first examined the participant’s response to the attending prompts of the structural aspects as they appear in the ACS report format (see Fig. 4). Responses classified as attention to structural aspects included attending to the elements of arguments, which include claims and justifications, and identifying the types of justifications. Responses classified as attention to dialogic aspects included: co-constructing of arguments, critique of arguments, mutual respect, and working toward consensus-building. With regard to participants’ interpreting of argumentation, for each ACS-report we classified the participant’s responses according to the content of the interpretation, i.e., the factors through which the argumentation was interpreted (Task characteristics, Teaching strategies, Student cognitive characteristics, Student affective characteristics and Socio-cultural characteristics). With regard to participants’ deciding how to respond, for each ACS-report we classified the participant’s responses according to the deciding-how-to-respond prompts as they appear in the ACS-report format (see Fig. 4).

For each SMT's initial and refined ACS report, the researchers employed directed content analysis applying the quality levels of the three noticing skills: attending to structural and dialogic aspects (for each of the four dialogic aspects), interpreting of argumentation (for each of the five factors), and deciding-how-to-respond as presented in the rubric format ("The ACS rubric format" section). The scoring for each participant was given as follows: each score corresponded to the quality level attained for the skill (for example, a score of 2 for level 2). For attending to structural aspects of argumentation, scores ranged from 1 to 2 (see examples of coding of teachers’ responses in Table 2). For attention to dialogic aspects and deciding how to respond, scores ranged from 1 to 3 (see examples in Tables 3 and 5). For interpreting of argumentation, scores ranged from 1 to 4 (see Table 4).

To determine whether a change occurred in SMTs’ noticing of argumentation, we applied nonparametric methods due to the ordinal nature of the variables examined. The following statistical tests were used. McNemar's test, which is used on paired nominal data, was employed to determine whether a change occurred in SMTs' scores for attending to structural aspects of argumentation, since only two scores were used (scores 1 and 2). For the three remaining skills, namely attending to dialogic aspects (for each of the four aspects), interpreting (for each of the five factors), and deciding how to respond, the nonparametric Wilcoxon Signed-Rank test, which compares two dependent samples measured on an ordinal scale, was employed to determine whether a change occurred in the SMTs' scores between the Initial and the Refined ACS-reports. The effect size for the Wilcoxon Signed-Rank tests was also calculated, using the formula r = z/√N (Pallant, 2011). Cohen (1988) defined effect sizes as “small, r = 0.1”, “medium, r = 0.3”, and “large, r = 0.5”.
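The effect-size formula r = z/√N can be illustrated with a short sketch. It assumes that N is the number of participants (61), which reproduces the r values reported in the Findings. The function names, the use of the absolute value of z, the treatment of Cohen's benchmarks as lower cut-offs, and the "negligible" label are illustrative assumptions rather than part of the study's procedure.

```python
import math


def wilcoxon_effect_size(z: float, n: int) -> float:
    """Effect size r = |z| / sqrt(N) for a Wilcoxon signed-rank z statistic."""
    return abs(z) / math.sqrt(n)


def cohen_label(r: float) -> str:
    """Map r to Cohen's (1988) benchmarks, read here as lower cut-offs (assumption)."""
    if r >= 0.5:
        return "large"
    if r >= 0.3:
        return "medium"
    if r >= 0.1:
        return "small"
    return "negligible"  # label added for illustration only


# z value reported in the Findings for co-constructing arguments; N = 61 SMTs
r = wilcoxon_effect_size(-3.58, 61)
print(round(r, 2), cohen_label(r))
```

With z = −3.58 and N = 61, the sketch gives r ≈ 0.46, which falls in the medium range.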

For RQ2, we explored the factors contributing to (or inhibiting) changes in SMTs’ noticing of argumentation, from their point-of-view. We conducted interpretive and in-depth qualitative analysis on the SMTs’ written reflections (n = 61) and interview transcripts (n = 20) (Creswell, 2007). Using inductive line-by-line coding of the teachers' written reflections and the interview transcripts, we looked for descriptions of the factors that shaped the change in SMTs’ noticing-of-argumentation. The analysis involved iterations of sorting the data and continual comparisons between the data and the developing categories, as well as across the categories themselves. Interpretations were discussed by the two authors of the paper, and cycles of check-coding were used until consensus was reached (Miles & Huberman, 1994). This process resulted in a coding scheme with eleven themes grouped into three main types: (1) Activity factors: seven themes related to factors associated with the peer-assessment experience, which according to the SMTs contributed to their noticing of argumentation; (2) SMT factors: three themes related to SMT factors, which according to them, enabled, but also constrained their noticing of argumentation; and (3) Contextual factors: one theme related to the specific ACS characteristics, as described in "Thematic analysis of SMTs’ written reflections and interviews" section.

Findings

The first four sections focus on findings relating to changes in SMTs' noticing-of-argumentation (RQ1). The fifth section focuses on factors contributing to (or inhibiting) changes in SMTs’ noticing-of-argumentation, from the SMTs’ perspective (RQ2).

Overall, the results provide evidence of improvement in the SMTs’ noticing of argumentation (i.e., attending, interpreting, and deciding how to respond) following their participation in the peer-assessment process. Importantly, their attention to dialogic aspects became more versatile, meaning that they attended to more aspects with a detailed description of how each aspect is demonstrated in the ACS. Likewise, their interpretations became more versatile and more evidence-based, and their skill of deciding how to respond improved as well.

Change in SMTs’ Attending to structural aspects of argumentation

To determine whether there was a change in SMTs' attending to structural aspects of argumentation, we used McNemar’s test to compare scores between the Initial and Refined ACS-reports. The results, indicating statistically significant change (p = 0.001), are displayed in Table 6.

Table 6 Distribution of scoring of SMTs' Attending to structural aspects of argumentation, initial and refined ACS-report

As shown in Table 6, 18% of SMTs increased their score for attending to structural aspects of argumentation from level 1 to level 2, while no SMT's score decreased.
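As a rough check on how a McNemar p-value of this size can arise, the sketch below computes the two-sided exact (binomial) McNemar test from the two discordant counts. Two assumptions are made here that the text does not state: that the exact variant was the one used, and that 18% of 61 SMTs corresponds to 11 teachers moving from level 1 to level 2 (with 0 moving down). The function name is hypothetical.

```python
from math import comb


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of change
    k = min(b, c)
    # Binomial(n, 0.5) probability of a count at most k, doubled for two sides
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Assumed counts: 11 SMTs increased (about 18% of 61), 0 decreased
p = mcnemar_exact_p(11, 0)
print(round(p, 3))
```

Under these assumptions the p-value rounds to 0.001, in line with the value reported above.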

Change in SMTs' Attending to dialogic aspects of argumentation

Our research sought to identify changes in SMTs' attending to dialogic aspects of argumentation. Wilcoxon Signed Ranks Tests were employed to determine whether a change occurred in SMTs' attending to each one of the four dialogic aspects of argumentation: co-constructing of arguments, critiquing arguments, mutual respect, and working toward consensus-building, as reflected in the Initial and Refined ACS-reports. All these aspects were measured on a 1–3 scale. The results, indicating statistically significant changes in all four dialogic aspects, are displayed in Table 7.

Table 7 Four dialogic aspects of argumentation (scale 1–3) Wilcoxon test and effect size for Initial ACS-report and refined ACS-report and percentage of increase

As shown in Table 7, a significant change in attending to the four dialogic aspects occurred between the Initial and Refined ACS-reports. Co-constructing arguments changed from the Initial ACS-reports (Mdn = 3) to the Refined ACS-reports (Mdn = 3), z = −3.58, p < 0.001, with a medium effect size, r = 0.46, where 25% of SMTs increased their score; critiquing arguments changed from the Initial ACS-reports (Mdn = 2) to the Refined ACS-reports (Mdn = 3), z = −3.64, p < 0.001, with a medium effect size, r = 0.47, where 26% of SMTs increased their score; mutual respect changed from the Initial ACS-reports (Mdn = 2) to the Refined ACS-reports (Mdn = 3), z = −4.40, p < 0.001, with a large effect size, r = 0.56, where 36% of SMTs increased their score; working toward consensus-building changed from the Initial ACS-reports (Mdn = 2) to the Refined ACS-reports (Mdn = 3), z = −4.51, p < 0.001, with a large effect size, r = 0.58, where 41% of SMTs increased their score.

We were also interested to see whether a change occurred in the number of dialogic aspects attended to by the SMTs between the Initial and Refined ACS-reports. We focus only on levels 2 and 3, as level 1 indicates no attention at all. Table 8 presents the distribution of teachers according to the number of aspects they attended to (0–4) in the Initial and Refined ACS-reports.

Table 8 A cross-analysis of the number of dialogic aspects of argumentation addressed by each SMT

As shown in Table 8, the teachers’ attention to dialogic aspects improved following the peer-assessment process. A decrease (from Initial ACS-report to Refined ACS-report) is evident in the number of SMTs who attended to 0, 1, 2 and 3 aspects and a major increase in the number of SMTs who attended to all four aspects.
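The counting rule behind Table 8 (an aspect counts as attended to only at level 2 or 3) can be sketched as follows. The scores in the example are hypothetical and are not data from the study; the function and variable names are likewise illustrative.

```python
DIALOGIC_ASPECTS = (
    "co-constructing arguments",
    "critiquing arguments",
    "mutual respect",
    "working toward consensus-building",
)


def aspects_attended(scores: dict[str, int]) -> int:
    """Count dialogic aspects scored at level 2 or 3 (level 1 = no attention)."""
    return sum(1 for aspect in DIALOGIC_ASPECTS if scores[aspect] >= 2)


# Hypothetical rubric scores (1-3) for one teacher, NOT taken from the study
initial = dict(zip(DIALOGIC_ASPECTS, (3, 2, 1, 1)))
refined = dict(zip(DIALOGIC_ASPECTS, (3, 3, 2, 3)))

print(aspects_attended(initial), aspects_attended(refined))
```

Applying the same count to each SMT's Initial and Refined ACS-report yields the pair of values (0–4) that a cross-analysis of this kind tabulates.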

Change in SMTs' interpreting

Our research sought to identify changes in SMTs' skills of interpreting the argumentation in the situation through different lenses, as reflected in the Initial and Refined ACS-reports. The different lenses represent the factors that might contribute to shaping the argumentation (enabling or inhibiting), including task characteristics, teaching strategies, student cognitive and affective characteristics, and socio-cultural characteristics. Note that the SMTs were given the opportunity to attend to additional factors other than the ones mentioned (see Fig. 4); however, they did not do so. Wilcoxon Signed Ranks Tests were employed to determine whether a change occurred in SMTs' interpreting. The results, displayed in Table 9, indicate a statistically significant change in SMTs' skills of interpreting the argumentation through different lenses between the Initial and Refined ACS-reports: task characteristics, teaching strategies, student cognitive and affective characteristics, and socio-cultural characteristics.

Table 9 Skills of interpreting the argumentation (scale 1–4) Wilcoxon test and effect size for Initial ACS-report and refined ACS-report and percentage of increase

As shown in Table 9, a significant change occurred in SMTs' interpreting the argumentation through different lenses between the Initial and Refined ACS-reports. Task characteristics changed from the Initial ACS-reports (Mdn = 2) to the Refined ACS-reports (Mdn = 3), z = −4.53, p < 0.001, with a large effect size, r = 0.58, where 43% of SMTs increased their score; teaching strategies changed from the Initial ACS-reports (Mdn = 3) to the Refined ACS-reports (Mdn = 4), z = −3.94, p < 0.001, with a large effect size, r = 0.50, where 30% of SMTs increased their score; student cognitive characteristics changed from the Initial ACS-reports (Mdn = 3) to the Refined ACS-reports (Mdn = 3), z = −4.45, p < 0.001, with a large effect size, r = 0.57, where 39% of SMTs increased their score; student affective characteristics changed from the Initial ACS-reports (Mdn = 1) to the Refined ACS-reports (Mdn = 3), z = −4.77, p < 0.001, with a large effect size, r = 0.61, where 48% of SMTs increased their score; socio-cultural characteristics changed from the Initial ACS-reports (Mdn = 1) to the Refined ACS-reports (Mdn = 3), z = −5.02, p < 0.001, with a large effect size, r = 0.64, where 53% of SMTs increased their score.

Figure 5 illustrates the percentage of addressing the different factors when interpreting the argumentation in the ACS (levels 3 and 4) between the Initial and Refined ACS-reports.

Fig. 5 Skills of interpreting the argumentation in the situation through different lenses (levels 3 and 4), Initial and Refined ACS-reports

As shown in Fig. 5, the SMTs demonstrated strength in the Initial ACS-report by offering an interpretation for how the teaching strategies and the student cognitive characteristics might have shaped the argumentation in the situation. Most SMTs reached high levels (3&4) with respect to the teaching strategies (85%) and student cognitive characteristics (62%). By contrast, SMTs exhibited some difficulty in offering an interpretation for how the student affective characteristics, socio-cultural aspects and task characteristics might have shaped the argumentation in the situation. About two-fifths of SMTs reached high levels with respect to task characteristics, and several SMTs reached high levels with respect to student affective characteristics (15%) and socio-cultural characteristics (13%). In the refined ACS-report, the SMTs demonstrated high levels of interpretation for the teaching strategies, student cognitive characteristics and task characteristics. SMTs reached high levels (3&4) with respect to the teaching strategies (97%), student cognitive characteristics (85%) and task characteristics (69%) factors. By contrast, roughly half of the teachers reached high levels of interpretation when addressing the student affective characteristics (54%) and socio-cultural characteristics (54%) factors.

Table 10 displays findings from a cross-analysis of the number of factors (0–5) that each of the 61 SMTs addressed in the Initial and Refined ACS-reports. We focus here on the high, evidence-based levels (3 and 4).

Table 10 A cross-analysis of the number of factors (enablers/inhibitors) addressed by SMTs in their interpretation

As shown in Table 10, the SMTs' interpretation skills improved following the peer-assessment process. We see a decrease in the number of SMTs who addressed 0, 1, 2 and 3 factors in their interpretation of the argumentation, and a major increase in the number of SMTs who addressed 4 and 5 factors in the Refined ACS-report.

Change in SMTs' deciding how to respond skills

The Wilcoxon Signed Ranks test was employed to determine whether a change occurred in SMTs' deciding-how-to-respond skill, as reflected in the Initial and Refined ACS-reports. The results, indicating a statistically significant change, are displayed in Table 11.

Table 11 Deciding how to respond skill (scale 1–3), Initial ACS-report and Refined ACS-report

As shown in Table 11, a significant increase occurred in SMTs' skills of deciding how to respond from the Initial ACS-reports (Mdn = 2) to the Refined ACS-reports (Mdn = 3), z = −5.2, p < 0.001, with a large effect size, r = 0.67, where 44% of SMTs increased their score.

Thematic analysis of SMTs’ written reflections and interviews

This section examines the factors contributing to (or inhibiting) changes in SMTs’ noticing-of-argumentation, from their perspective (RQ2). The analysis of the written reflections and interview transcripts yielded a coding scheme with eleven themes grouped into three main types: (1) Activity factors: seven themes related to factors associated with the peer-assessment experience which, according to the SMTs, contributed to their noticing of argumentation; (2) SMT factors: three themes related to SMT factors which, according to them, affected their noticing of argumentation; and (3) Contextual factors: one theme related to specific ACS characteristics.

Themes related to contributing aspects associated with the peer-assessment experience

Exposure to a diversity of peer reports, discussing their assessment with peers, and the assessments received, contributed, according to the SMTs, to their skills of noticing of argumentation. Table 12 presents the seven themes relating to this category.

Table 12 Themes related to factors associated with the peer-assessment experience

Themes related to contributing aspects associated with SMT factors

Table 13 presents themes related to SMT factors which, according to them, enabled, but also impeded their noticing of argumentation.

Table 13 Themes related to SMT factors

Themes related to contributing aspects associated with the specific ACS characteristics

Table 14 presents a theme related to the specific ACS characteristics which, according to the SMTs, enabled but also restricted certain opportunities for addressing some aspects of argumentation.

Table 14 A theme related to the specific ACS characteristics

Discussion

This study focused on the changes in secondary mathematics teachers' noticing of argumentation through engagement in peer-assessment strategies. In a peer-assessment cycle, they analyzed the “abbreviated multiplication formulas” situation using the ACS-report format, assessed peers' ACS-reports using the ACS rubric format, refined the Initial ACS-report after receiving feedback from peers, and reflected on self-perceived factors that contributed (promoted or inhibited) to the change in their noticing of argumentation.

The research results provide evidence of improvement in the SMTs’ noticing of argumentation (i.e., attending, interpreting, and deciding how to respond) following their participation in the peer-assessment process. In terms of teachers' attending, a noticeable change was found in attending to dialogic aspects of argumentation (co-constructing of arguments, critiquing arguments, mutual respect, and working toward consensus-building). Whereas about half of the teachers addressed some of these aspects in their Initial ACS-report, most teachers provided detailed description of how each of the four aspects was manifested in the situation following their participation in the peer-assessment process. Some increase was also found in the number of teachers who reached higher levels in their attention to structural aspects of argumentation; however, in this case the attention to structural aspects was already high from the outset. These important findings suggest that the teachers attained an impressively high level in the skill of “seeing” the mathematical claims and justifications in the ACS and in the participants' interactions once they had developed the ability to generate and critique each other’s ideas. These findings stand in contrast to previous results showing that elementary school teachers generally do not attend to justifying and generalizing in the manner recommended in the mathematics education literature (Melhuish et al., 2019, 2020). We suggest that our novel framework helps to better distinguish between the process of the argumentation practice (i.e., the dialogical aspects) and the product of the practice (i.e., the structural aspects) (Staples & Conner, 2022). It seems that possessing such a framework helps teachers to notice these differences as well. 
The incorporation of these aspects is central to argumentation deemed ‘fruitful’ for enhancing mathematical understanding and learning, and these aspects are therefore important to notice (Francisco & Maher, 2005; Weber et al., 2008).

In terms of teachers' interpreting, a noticeable change was found in the SMTs' interpreting of the argumentation in the ACS from different perspectives: how different factors, such as the task characteristics, teaching strategies, student cognitive characteristics, student affective characteristics, and socio-cultural characteristics, may have shaped the argumentation. We also found a considerable increase in the number of factors addressed by each teacher in the Refined ACS-report, compared to the Initial ACS-report. Whereas most teachers addressed 0–3 factors in the Initial ACS-report, with teaching strategies and student cognitive characteristics being the most common, roughly half of the teachers addressed 4–5 factors in the Refined ACS-report. At this stage, beyond teaching strategies and student cognitive characteristics, task characteristics also stood out in the teachers' interpretations of the situation.

These important findings suggest that by participating in our study, the teachers attained a new and valuable skill of making sense of what they noticed in the ACS. In the literature, the different factors discussed by the teachers are associated with teaching that supports opportunities for students to participate in argumentation (e.g., Mueller et al., 2012; Staples, 2014). The teachers’ discussion of these factors and their role in the argumentation situation is therefore significant. It is also not obvious considering findings of past research. Ayalon and Hershkowitz (2018) explored what secondary mathematics teachers attended to when asked to choose mathematical tasks that they view as having potential to encourage class argumentation. Analysis of the explanations which teachers provided to justify their choice of tasks revealed that the attentiveness of most teachers was only partial or almost non-existent; they attended only to the task characteristics, or only to the social situation, or exhibited no attentiveness to any dimension whatsoever. In contrast to the current study’s participants, only a few teachers in Ayalon and Hershkowitz’s study (2018) addressed multiple factors related to integrating argumentation into the mathematics classroom.

The findings in the present study also reveal some difficulty in offering high-level interpretations of how students' affective characteristics and/or socio-cultural aspects may have contributed to the argumentation. Despite the improvement in addressing these factors following the peer-assessment process, only about half of the teachers addressed these factors and provided evidence for their claims. Being sensitive to students' affect, emotions, and self-confidence is important for teachers' ability to manage the activity (Ayalon et al., 2022; Knuth & Sutherland, 2004). The activities associated with argumentation, such as learning to voice arguments, exchange ideas, listen attentively to others, and critically evaluate the strengths and weaknesses of different perspectives, are emotionally demanding (Slakmon & Schwarz, 2019). Likewise, it is essential for student participation in argumentation to maintain a classroom environment governed by norms such as recognizing the value of argumentation and expectations for critique, collaboration, and mutual respect (Mueller et al., 2014; Yackel & Cobb, 1996). School and institutional norms, which are social in nature, can also influence the teacher's classroom practices (e.g., Ball & Forzani, 2009; Chazan et al., 2016). Therefore, considering these aspects while making sense of the situation is important. One possible explanation for the teachers' relative difficulty in interpreting the situation through the lenses of student affective characteristics and socio-cultural aspects, which is supported by some of the participants’ reflections, may be related to the characteristics of the specific ACS they were working on, as is elaborated later.

As for teachers' deciding how to respond, an improvement was found in the teachers' skill of providing alternatives with robust evidence to support their suggestions. Already at the beginning of the cycle, most of the teachers offered alternatives relevant to the situation. These focused primarily on encouraging students to participate in the argumentative activity and suggesting strategies to foster students’ mathematical thinking. At the same time, many teachers struggled to provide robust evidence to support their alternatives. Thus, their ideas largely remained at a general level, with no in-depth reference to how they would open up learning opportunities different from those occurring in the given situation. This skill improved following the peer-assessment experience. Research on teacher noticing demonstrates that teachers tend to offer disconnected or vague alternative teaching responses (Barnhart & van Es, 2015). Moreover, the research literature indicates that it is more challenging to develop responding skills than attending skills (e.g., Jacobs et al., 2010; Yang et al., 2021). The fact that a noteworthy proportion of teachers participating in the current study demonstrated an improvement in their skillset and awareness is very encouraging. Further study is needed to explore how best to encourage teachers to devise situation-based strategies that advance the argumentation in the lesson.

Our research design does not facilitate making unequivocal claims regarding the reasons for change in participants’ noticing of argumentation. However, analysis of the teachers' reflections attests to some of the self-identified factors that supported or impeded their noticing. From the teachers' responses, we learn that three main factors impacted their noticing of argumentation: peer-assessment experience, SMT factors, and characteristics of the specific ACS.

According to the teachers' responses, factors relating to their subjective experience during the peer-assessment process were prominent. Participation in group discussions exposed them to new knowledge on various aspects of argumentation and new terminology with which to discuss the argumentation in the situation. Furthermore, using a specific rubric for giving and receiving feedback, deemed crucial in effective formative assessment (Swan & Burkhardt, 2012), seemed to support their noticing of argumentation. Through negotiation with peers about the rubrics and assessments, they noticed various details related to argumentation in the ACS, which they had not considered before, and attended more reflexively to their practice in interpreting the situation. These findings correspond with those of studies indicating that when learners analyze the work of others, they can access multiple examples that aid them in noticing nuances in the work quality (Topping, 2010). They can then discern flaws in their own work and generate ideas to improve it (Hattie & Timperley, 2007). Moreover, by the teachers' own account, exposure to strengths and weaknesses in others' work motivated them to devote efforts to reducing the inconsistencies in their work and improving their performance (ibid.). Interestingly, some teachers mentioned that the group discussion assessing peer reports was argumentative in nature, with participants raising differing viewpoints, critiquing others’ ideas, providing evidence for their claims, and voicing support or objections, thereby contributing to their internalization of the concept of argumentation.

Several teachers noted that the specific ACS enabled them to address several aspects of argumentation, while at the same time hindering the noticing of other aspects. Transcripts are advantageous in that they enable one to view move sequences in their entirety while allowing the analysis of each individual move in terms of its position in the transcript sequence (Scherrer & Stein, 2013), so they may be well suited to emphasizing the salient features of the discussion that we are interested in. In our case, it seems that the ACS we chose afforded the noticing of various aspects of argumentation that we were interested in. Still, some of the teachers reported that students' affective characteristics and socio-cultural aspects were not prominent in the given ACS. Further research is needed to examine the limitations of different ACSs in supporting teachers' noticing of different aspects of argumentation.

Finally, from the teachers’ reflections we learn that SMT factors such as their own views on teaching and learning, their experience in analyzing argumentation, and their self-confidence played a role in shaping their learning experience. Schoenfeld (2011, p. 232) points out that “what teachers notice, and how they act on it, is a function of the teachers’ knowledge and resources, goals, and orientations.” This assertion is corroborated empirically, for example, by Meschede et al. (2017), who recognized that teachers’ noticing corresponds significantly with the pedagogical content knowledge and constructivist beliefs they possess with respect to learning and teaching. Consideration should be given to designing research interventions that strengthen such teacher factors with the goal of promoting teachers' noticing skills.

The fact that the teachers regarded the activity as a positive experience is encouraging. The literature on undergraduate peer-assessment experiences emphasizes that peers may be reluctant to critique the work of others (Hanrahan & Isaacs, 2001) and to receive such critiques from others (Smith et al., 2002). Surprisingly, this study found no teachers who expressed any negative sentiments about this in their reflections. They did refer to disagreements but couched them in positive terms, as benefiting their learning. We think it is likely that in using peer-assessment for learning, the potential for negative affect or power issues (Topping, 2010) was alleviated. Huberman (1995), in discussing practicing teachers’ interactions, emphasized that even though some level of mutual affective comfort between teachers is necessary, it is likely to be successful in encouraging reflection on improving practice only if each teacher is committed to the work in which they are engaged. While they need not agree, they do need to be open to another’s perspective and attempt to understand it as much as possible.

Conclusion

The research findings contribute to the literature on professional learning, specifically on developing teachers' noticing of argumentation, by providing evidence of the potential of the peer-assessment strategy for teachers' learning of key aspects of argumentation practice. A valued but difficult-to-achieve goal in school mathematics is argumentation (Asterhan & Schwarz, 2016), and professional learning using classroom situations of argumentation and involving peer-assessment cycles may provide teachers with useful initial experiences to help them notice the various aspects associated with enhancing argumentation in the mathematics classroom.

Exploration of the engagement of this cohort of secondary mathematics teachers, even for a short duration, with one peer-assessment cycle, enabled us to consider some of the likely advantages and challenges associated with implementing peer-assessment as a learning tool in teachers' professional development courses. Moreover, our findings point to the potential of placing teachers as agents in their own learning, by having the teachers “play” the researcher by using the researcher’s rubric. The emergent themes may support more extensive research on strategies to effectively develop teachers’ noticing of argumentation. Conducting longitudinal research into teachers' development of noticing of argumentation, over a longer time period, with a greater variety of collaborative activities for analysis of teaching scenarios, is recommended. In our broader research, of which this reported study is part, we follow the SMTs' engagement in analyzing additional ACSs over time, hoping to gain deeper insights into these issues. Finally, affording teachers diverse opportunities to experience engaging their students in argumentation, and then analyzing and reflecting on their teaching, would be valuable.

A possible limitation of this study relates to the use of the same ACS in both phases; hence, a shift in noticing may also be a result of reading the same transcript for a second time. However, using a different ACS in the final phase could potentially affect the results as well, as different ACSs may have different affordances and limitations with regard to their stimulus for noticing. Future research might explore this issue by engaging participants in analysis of ACSs, with one group analyzing the same ACS in both phases and another group analyzing a different ACS in each phase. Another limitation of the study is that we do not know whether, or for how long, the improvement in noticing following the peer-assessment process will remain. For example, when the SMTs analyze a different ACS, will their noticing-of-argumentation skills be expressed? This issue is currently being investigated in our broader research. Likewise, this study did not investigate teachers’ subsequent implementation of argumentation in their classroom practice. Yet an initial step toward encouraging classroom implementation is through engaging teachers in situations that prompt them to re-examine events and come to view them differently (Kennedy, 2016). In the present study, the peer-assessment cycle seemed to support the teachers in seeing more details in the argumentation situation and making sense of them in different and more complex ways. Further research is needed to explore the ways in which participation in such professional learning may be realized in classrooms. We are encouraged by some participants’ expressions of their desire to experiment with the argumentative activities in their classrooms. For example, one of the teachers, named Aber, said:

In my classes, I do not usually develop an argumentative dialogue. My experience in analyzing an argumentative situation and experiencing a peer-assessment process have helped me realize the importance of argumentation for developing the students' learning, understanding, and participation in mathematics classes. So as a teacher I have begun to develop this kind of activity in my classes.

In this study, we draw on our theoretical and empirically based conceptualizations of argumentation in the mathematics classroom and on research on teacher noticing in general to conceptualize teachers’ noticing of argumentation (Ayalon, under revision). In accordance with our conceptualizations, the artifacts used in this study were carefully designed with the goal of stimulating and developing teachers’ noticing of argumentation. These included the argumentation classroom situation (ACS), the ACS-report format, and the ACS rubric. We acknowledge that both the theoretical constructs and the research tools used in this study may not be complete. Nevertheless, we hope that they may serve researchers and teacher educators as a basis for further improvement and refinement, for promoting and broadening teachers’ noticing of argumentation in the mathematics classroom. Teacher educators can, for example, engage teachers in exploring a series of ACSs focusing on different mathematical topics and originating from different classrooms (their own classroom and/or other teachers’ classrooms), using the written report format. Currently, we are analyzing data collected over time in a course focused on promoting teachers' noticing of argumentation through their engagement in extensive work on different ACSs from different topics in mathematics. Through these analyses, we hope to gain more insights into teachers’ development of noticing of argumentation, as well as a better understanding of why engagement in cycles of ACS analysis appears to influence teachers' noticing of argumentation.

Last but not least, to some extent our approach to argumentation may not be restricted to mathematics alone. Whereas our framework engages teachers in discussing the justifications and norms valued in the mathematics classroom, other aspects, such as the dialogic ones, are more discipline-general. This implies that researchers and teacher educators from other fields, such as science and history, could adapt our approach and tools to their own uses.