Introduction

Writing is an essential communication skill, both in daily life and in professional or academic contexts (Graham & Harris, 2013). For this reason, the acquisition of writing skills is one of the main aims of education (Psyridou, Tolvanen, Lerkkanen, Poikkeus, & Torppa, 2020). However, writing is a complex task in which the writer has to manage several processes: planning (preparing the content by activating previous knowledge and organizing the main ideas), transcription (grammatical encoding of the ideas by retrieving syntactic and semantic knowledge), and reviewing (Hayes, 2012; Kellogg, 2018).

Given the complexity of writing, it is not surprising that this skill is demanding for novice or young writers, especially when the transcription process (e.g., orthography, sentence construction) is not automatic (e.g., Limpo & Alves, 2013; Olive & Kellogg, 2002). If the transcription process is not automatic, students focus most of their cognitive resources on text production, and few resources are available for fundamental processes such as planning or reviewing (Rijlaarsdam et al., 2011). Even in older students (upper-primary and undergraduate students), researchers who have explored the processing time devoted to writing processes have shown the role of planning and reviewing to be minimal and the use of these cognitive processes to be inefficient (Beauvais, Olive, & Passerault, 2011; Limpo, Alves, & Fidalgo, 2014; López, Torrance, & Fidalgo, 2019; Torrance, Fidalgo, & Robledo, 2015).

In this context, an effective practice to help young writers deploy planning or reviewing processes and manage composition is to teach them to use strategies that reduce the cognitive overload that usually occurs when composing (Kellogg, 2018). In this regard, several meta-analyses have reported that one of the most effective approaches to improving the quality of students’ texts is strategy-focused instruction, either alone or combined with self-regulation procedures (Graham & Harris, 2017; Graham, McKeown, Kiuhara, & Harris, 2012). The relation between using self-regulatory procedures during writing (e.g., planning and goal setting, self-monitoring, self-instruction) and producing good text is well established in the literature (Harris, Graham, Mason, & Saddler, 2002; Palermo & Thomson, 2018; Rosário et al., 2019; Saddler, Asaro-Saddler, Moeyaert, & Cuccio-Slichko, 2019).

One of the most successful instructional approaches that combines strategy-focused instruction and self-regulated procedures is Self-Regulated Strategy Development (SRSD) (Harris & Graham, 2017). The overall goal of SRSD is that students use the target strategies autonomously because the strategies are the key to achieving self-regulated performance (MacArthur, 2017). These strategies include process knowledge about such things as planning so that students establish procedural goals (“The first thing I have to do is plan my text, first I will make an outline of the ideas”), and discourse knowledge to ensure that students engage in product goal setting (“what should I include to make sure my text is adapted to audience needs?”) (Graham et al., 2012). The effectiveness of SRSD is well established across different educational stages, for students with and without learning difficulties, and for different textual typologies (Brunstein & Glaser, 2011; Festas et al., 2015; Palermo & Thomson, 2018; Rosário et al., 2019; Saddler et al., 2019).

Strategy-focused instruction is not a single technique; it has a multicomponent nature (Fidalgo, Harris, & Braaksma, 2017; MacArthur, 2017) which combines different instructional methods identified as effective evidence-based practices (e.g., Graham & Perin, 2007; Graham et al., 2012; Koster, Tribushinina, de Jong, & van den Bergh, 2015): (a) direct teaching of knowledge about planning and/or reviewing, and of knowledge about setting appropriate product goals for what the final content should be, with both types of knowledge supported by strategies and mnemonics; (b) modelling with Think aloud by the teacher, who provides examples of these procedures and strategies in front of the class; and (c) peer or individual practice for students to emulate and practice these processes in a supportive context where the instructor guides and encourages them to achieve autonomy in writing (Graham, Gillespie, & McKeown, 2013). Although the efficacy of strategy-focused instruction is well demonstrated, this multicomponent nature prevents us from knowing the mechanisms by which the effect is achieved (Fidalgo et al., 2017). In this regard, for example, Graham, Harris and co-workers (Graham & Harris, 1989; Sawyer, Graham, & Harris, 1992) examined various decompositions of SRSD. In the first study, Graham and Harris (1989) compared strategy-focused instruction with and without components explicitly aimed at developing self-regulation skills (goal setting and self-monitoring). The authors found similar benefits in both experimental conditions. Sawyer et al. (1992) extended previous research on components and added a third “Direct teaching” condition. In this condition, the authors removed modelling and collaborative practice. Again, students in all three conditions (SRSD, SRSD without goal setting and self-monitoring, and Direct teaching) showed benefits relative to practice-only controls, with no evidence of difference among conditions.

In this context, the aim of the present study is to explore a central question about strategy-focused instruction: whether teaching procedural knowledge, specifically planning processes, is necessary when discourse knowledge is taught exhaustively (Torrance et al., 2015). As Torrance (2015) argues, there are two compatible ways to teach students to plan. The first and most common way is to teach explicit planning procedures, such as advance planning (Bouwer, Koster, & van den Bergh, 2018; Saddler et al., 2019; Shen & Troia, 2018). Of course, teaching advance planning procedures can help young writers manipulate content before they start writing and reduce the cognitive overload when composing (Graham & Harris, 2007; Kellogg, 2018; Rijlaarsdam et al., 2011). However, studies exploring the benefits of advance planning for textual quality have shown mixed results that vary by students’ ages and writing tasks (Limpo et al., 2014; López et al., 2019; Olinghouse & Graham, 2009).

Alternatively, the planning process can be taught by focusing attention on setting product goals, in the hope that during writing, students spend time planning content based on those goals (Torrance, 2015). For example, Torrance et al. (2015) implemented the Cognitive Self-Regulation Instruction program (CSRI; based on the SRSD model) to develop 6th grade students’ self-regulated mastery of their writing (Zimmerman, 2000). The authors compared the effects of full CSRI instruction (including explicit instruction in planning and revision) with a modified version (“product only”) in which students received the same instruction, but with all reference to planning strategies removed. In this condition students learned to set explicit product goals with the help of mnemonics (ensure text structure, use appropriate links, etc.), but without any explicit encouragement to plan their text. These two conditions were compared with practice-matched controls. Both experimental conditions exhibited improvements in the three measures (structure, coherence and quality) of their writing compared to the control, with no evidence of benefits of full CSRI over those provided by the “product only” condition. At post-test, the full-CSRI group spent more time planning; however, the use of process strategies was not associated with additional benefit to text quality.

Based on the above, it is possible to conclude that, through strategy-focused instruction, students can be encouraged to use planning procedures in different and equally effective ways. Therefore, our aim is to explore and compare a full version of the CSRI program (pre-planning strategy and setting of theoretical goals supported by mnemonics) with a modified version of the same intervention without any direct reference to planning procedures (teaching only mnemonics for students to establish and work with the right product goals) in 4th grade students. Because learning to write depends not only on instructional (external) factors but also on the students’ own cognitive (internal) factors (Harris et al., 2002), it is important to ask whether CSRI strategy-focused instruction is effective for 4th grade students. Moreover, providing effective instruction in the first few years of Primary Education is an essential educational goal to prevent future difficulties in higher grades (Arrimada, Torrance, & Fidalgo, 2019). Thus, our aim is to extend our existing understanding of the effects of the CSRI program to participants who are younger (4th grade students) than those who have typically been studied in previous intervention evaluations (6th grade students) (Fidalgo, Torrance, Rijlaarsdam, van den Bergh, & Álvarez, 2015; López, Torrance, Rijlaarsdam, & Fidalgo, 2017; Torrance et al., 2015). We also wanted to determine the maintenance effects 7 months after finishing the training. If the goal of strategy-focused instruction is to teach students strategies to encourage effective autonomous learning, not just during an intervention but also in the long term, the study of maintenance effects becomes a key focus for educators’ instructional decisions (de Boer, Donker, Kostons, & van der Werf, 2018).

The present study

The aim of this study is to explore the short-term (6 sessions) and long-term effects, in terms of writing structure, coherence, and quality, of two forms of the CSRI program on the writing skills of 4th grade students: full-CSRI (planning process instruction and product instruction) and brief-CSRI, without explicit planning process instruction.

Based on prior empirical research about the benefits of strategy-focused instruction with typically developing students, we predict that CSRI students (both full-CSRI and brief-CSRI) will write better compare–contrast texts than the control group (reflected in greater structure, coherence, and quality of the text) in the short term (pre/post-test). That is, the writing from the students in the control condition will be less coherent, less structured, and of poorer quality than the writing from students in the experimental conditions. More specifically, in relation to the question of which experimental condition will perform better with respect to the control, based on the benefits of learning pre-planning procedures (e.g., Graham, Harris, & Mason, 2005; Kirkpatrick & Klein, 2009; Kiuhara, O’Neill, Hawken, & Graham, 2012), we predict that full-CSRI students will write better compare–contrast texts (the texts will have greater structure, coherence and quality) than brief-CSRI students.

As Graham and Harris (2017) argued, a significant gap in strategy-focused writing instruction is that few studies have evaluated the maintenance effects of instruction, and most have been no longer than 8 weeks (Hacker et al., 2015). This study is the first to evaluate the long-term effects (7 months) of both forms of CSRI strategy-focused instruction in fourth grade students. At the maintenance timepoint, we expect the effects of CSRI (both full-CSRI and brief-CSRI) to be maintained over and above the control condition. More specifically, the full-CSRI students will write more structured, coherent and better-quality compare–contrast texts than the brief-CSRI group and both will be better than the control. This is based on previous studies about the maintenance of effects of strategy-focused instruction that have shown that the effects last over time, more so than traditional instruction (Fidalgo, Torrance, & García, 2008; Glaser & Brunstein, 2007; Tracy, Reid, & Graham, 2009).

Method

Design

Schools participating in this study were recruited considering a minimum number of classes in each grade and ensuring that schools were closely matched demographically. A member of the research team contacted the school principals and provided information about the study and the CSRI program. Where school principals showed interest in the study, the information was presented to the school’s 4th grade teachers. Once the teachers agreed to participate and in order to ensure accordance with ethical standards (Declaration of Helsinki; Williams, 2008) we sent a letter to all of the participants’ families in order to fully inform them of the objectives and characteristics of the study. Written informed consent signed by the parents/guardians of all students was received prior to their participation in the study. Additionally, after the intervention the strategic instruction program was delivered to each of the teachers in the control group. This was to ensure that all study participants had the opportunity to benefit from the strategic intervention.

The effect of the instruction was assessed by the change in writing skills across three timepoints: immediately prior to intervention (pre-test), at the end of the intervention (post-test), and 7 months after finishing the intervention (maintenance). We wanted to wait as long as possible before evaluating the students, provided that the timing was suitable for them. We discounted the initial months of the school year, as the students would be adapting to their new year and new teacher (having started 5th grade). The best time was in the middle months of the school year, as the students would have adapted to their current school year. The teachers in the participating classes did not assign any compare–contrast compositions to their students in the period before the maintenance session.

Participants

Initially our sample comprised a total of 10 groups of 4th grade classes (N = 240) from three different state- and privately funded mixed schools. One class was excluded because it did not meet the study’s strict fidelity requirements. Thus, our sample comprised a total of 9 groups (N = 215) that were randomly assigned to one of the two experimental conditions or a control condition: 3 classes (n = 72) in full-CSRI (experimental condition 1), 3 classes (n = 69) in brief-CSRI (experimental condition 2) and 3 classes (n = 74) in the control condition. Students’ ages ranged from 9 to 10 years old. Participant information is provided in Table 1. Univariate analysis of variance (ANOVA) indicated no statistically significant differences between the groups in relation to age, F(2, 211) = 0.129, p = 0.87, ηp² = 0.001; and for gender, the Chi-square test did not show statistically significant differences in the sample (χ² = 0.16, p = 0.91).

Table 1 Descriptive characteristics of the participants for the three conditions

Prior to the intervention, all students followed the regular curriculum of Spanish primary schools, which focuses on teaching different genres (narrative, expository) as well as rules for correct spelling and grammar, without any strategy-focused instruction (see García, de Caso-Fuertes, Fidalgo-Redondo, Arias-Gundín, & Torrance, 2010). In relation to the reading curriculum, all participants had received similar forms of reading instruction based on the phonetic method (Alegría, Carrillo, & Sánchez, 2005).

Instruments and measures

Writing assessment tasks

In the writing assessment task, students had to write a total of three compare–contrast texts (pre-intervention, post-intervention and maintenance) based on the similarities and differences between various topics that were selected beforehand based on students’ interests (i.e., film versus book, traveling by car versus traveling by plane, traditional games versus computer games). The writing task topics were related to subjects covered in the students’ 4th grade curriculum. To avoid the potential effect of specific topics on measurements, the topics were counterbalanced across assessment tasks in all evaluations. Compare–contrast text is relatively more difficult than other types of expository genres because students have to generate categories and make comparisons within them in order to produce similarities and differences (MacArthur & Philippakos, 2010; Shen & Troia, 2018).
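As an illustration only, the counterbalancing of topics across assessment tasks can be sketched as a cyclic Latin-square rotation. The exact scheme used in the study is not specified, so the rotation, the group numbering, and the shortened topic labels below are assumptions made for the example.

```python
# Hypothetical sketch: rotate the three topics across the three timepoints
# so that, across groups, every topic appears once at every timepoint.
# The rotation scheme and the "groups" are assumptions, not the study's design.
TOPICS = ["film vs. book", "car vs. plane", "traditional vs. computer games"]
TIMEPOINTS = ["pre-test", "post-test", "maintenance"]


def latin_square_orders(topics):
    """Return one rotated topic order per group (a cyclic Latin square)."""
    n = len(topics)
    return [[topics[(group + t) % n] for t in range(n)] for group in range(n)]


orders = latin_square_orders(TOPICS)
for group, order in enumerate(orders, start=1):
    schedule = dict(zip(TIMEPOINTS, order))
    print(f"Group {group}: {schedule}")
```

With this kind of rotation, any topic effect is spread evenly over the pre-test, post-test, and maintenance measurements rather than being confounded with one timepoint.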

Evaluation Session The evaluation session lasted 1 h, during which a specialist researcher gave the students small cards which included the title of the topic (e.g., “traveling by car versus traveling by plane”) with a picture about the topic. Then students were asked to write a compare–contrast text. The specialist researcher provided students with two work sheets, one for rough work (“planning sheet”) and one for the final text. Students were told that they were free to use the first work sheet if they wished but that they did not have to use it. The specialist researcher reminded the students that they had 1 h to write their text (the usual duration of the students’ Spanish language class) and encouraged them to produce the best essay that they could. The specialist researcher did not provide any help during the evaluation writing task.

Product Assessment Texts were evaluated holistically by two researchers using three measures (structure, coherence, and quality) following the method described by Spencer and Fitzgerald (1993), which has been used in several previous studies (e.g., López et al., 2019). First, the two researchers received prior training with reader-based measures. Both independently rated a sample of 30 texts in three separate rounds, one round per variable. The mean inter-rater correlation (Pearson’s r) was found to be high for each measure (structure = 0.91; coherence = 0.90; quality = 0.92). Structure was assessed on a four-point scale, ranging from 1 = lack of any obvious structure to 4 = well structured. The evaluators were asked to identify whether the text presented a clear reference to introduction, development, and conclusion components. Coherence was also assessed on a four-point scale, with 1 = incoherent and 4 = entirely coherent. This score was based on whether it was possible to identify the main argument, whether the text presented clear progression of ideas without digressions and whether the text maintained local cohesion. Quality was assessed on a six-point scale, with 1 = not suitable, hard to understand and 6 = excellent. Scores were based on the extent to which the text included rich ideas, diverse and appropriate vocabulary, and interesting detail, as well as taking into account correct sentence structure, punctuation, and spelling. After the prior training, the evaluators independently rated all of the texts. Again, the mean inter-rater correlation (Pearson’s r) was found to be high for each measure (structure = 0.90; coherence = 0.91; quality = 0.91).
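For readers who wish to reproduce this kind of agreement check, the inter-rater correlation can be computed as follows. This is a minimal sketch of Pearson’s r in plain Python; the function name and the ratings are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one rater is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical structure ratings (1-4 scale) from two raters; NOT study data.
rater_a = [1, 2, 2, 3, 4, 3, 1, 4]
rater_b = [1, 2, 3, 3, 4, 3, 2, 4]
print(round(pearson_r(rater_a, rater_b), 2))
```

Note that Pearson’s r captures how consistently two raters order the texts, not whether they assign identical scores, so it is best read alongside the rating scales described above.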

Instructional program

The strategy-focused instruction used was the CSRI program, which aims for students to achieve cognitive self-regulation by developing strategic knowledge about how to produce good compare–contrast texts. The full-CSRI (experimental condition 1) taught students both the features of good compare–contrast texts along with planning and drafting strategies. However, in brief-CSRI (experimental condition 2), students learned through a strategic approach to set appropriate product goals for the compare–contrast texts (ensure appropriate text structure, coherence, organization and style for the reader) but without any explicit encouragement to plan their text. These two experimental conditions were compared with a practice-matched control group. The features of the three conditions are summarized below.

Experimental Condition 1: Full-CSRI The first experimental condition included the three components in sequence (direct teaching, modelling and peer practice), with two sessions for each and a total of six sessions. The focus of this condition was to help students achieve self-regulated planning procedures (self-reflections, self-instructions) focused on the product form and process goals.

The direct teaching component involved two sessions in which students were introduced to the strategy planning process (session 1) and then to the drafting process (session 2). At the beginning of session 1, the instructor taught a metacognitive matrix identifying the nature, purpose and central features of effective planning processes. Then, students were introduced to the mnemonic POD + the vowels OAIUE, to scaffold planning their compare–contrast texts. POD stands for each of the steps: (1) Think of ideas (Pensar) before writing; (2) Organize your thoughts with the OAIUE mnemonic rule [O (Objective) prompted students to identify the purpose of different text types; A (Audience) prompted students to capture future readers’ interest and attention, motivate them to read, make the text easier to understand, etc.; I (Ideas) prompted students to think of ideas, brainstorm or search for other documentary sources, and differentiate between main and secondary ideas and examples; U (Union) reminded students to connect the ideas in the text, joining thematic ideas (e.g., similarities vs. differences in compare–contrast texts); E (Esquema, Plan) reminded students to make a plan including ideas about the introduction, development and conclusion]; and (3) Develop the text. The strategy was supported by a chart showing the POD + vowels mnemonics. Moreover, to facilitate students’ learning about the OAIUE vowels, the knowledge was illustrated in a compare–contrast text. Similar to the first session, in the second session students were given a metacognitive matrix identifying the nature, purpose and central features of effective drafting processes. Then, the instructor taught the mnemonic IDC + the vowels OAIUE. IDC encourages the organization and structure of a compare–contrast text: (1) Introduction, in which students should present the topic and the purpose of the text and capture the readers’ interest; (2) Development, in which students were instructed to develop the ideas and the examples that explain these ideas; and (3) Conclusion, which reminded students to make a personal contribution to the text, an overall point of view, or a reflection on everything discussed in the text. Again, the vowels provide criteria about the content during all three of the IDC production phases.

In the modelling component the teacher demonstrated the correct writing process steps for the planning strategy (session 3) and drafting strategy (session 4). Modelling involved Think aloud while composing a compare–contrast text in front of the class. Think aloud was mainly scripted. The teacher emphasized explicit references to the strategies with a self-regulatory approach to the task (“The first thing I am going to do is, as the letter E—Esquema—says, a Plan of my text… But first of all, it is important that everything in my plan is done thinking of the letter A-Audience”), and with self-statements about positive expectations (“If I make a last effort I will do it”) producing a written plan (session 5) and draft text (session 6). The teacher explained to the students that during the modelling they had to concentrate on all the teacher’s steps and thoughts during the writing process (“the first thing I have to do is plan my text”) not on the ideas about the topic (“the similarity between cars and airplanes is that both are transport”). After modelling, students made notes about the model’s most important thoughts. Then, the instructor facilitated a whole-class discussion drawing together the students’ observations. At the end, each student individually wrote down reflections about the differences between their own writing practice and the processes that they had seen.

Finally, in the peer practice component students worked in pairs, emulating the planning (session 5) and drafting strategies (session 6) during writing. Session 5 started with the instructor reminding students about the mnemonic POD + the vowels. The instructor emphasized to the students that in order to emulate the planning process, they had to follow the modelling demonstrated by the teacher in the previous sessions. Consequently, the teacher reminded students that they had to use self-instructions with Think aloud to regulate what they were doing. For this purpose, students had a sheet with a list of self-instructions (created by the teacher during modelling). Once students had been reminded of the background information, the instructor selected students with similar abilities and paired them. Students were assigned to writer or helper roles. The more extroverted student, or the one more likely to Think aloud, was the writer, and the other was the helper. These roles were maintained throughout both sessions. During composition, Think aloud helped expose the writing processes adopted by the writer to the helper’s observation and comments. While the pairs were planning their texts (session 5), the students had in front of them a sheet with spaces for their own notes in the margins, laid out following the POD + vowels criteria (so students would not forget to apply them), and the sheet with a list of self-instructions to facilitate Think aloud during planning. In the second session of this component (session 6), the writer took the outline created in the previous session and translated it into text. This session then followed an identical pattern to session 5, with a focus on the IDC mnemonic. In both sessions, during Think aloud composing, the instructor also patrolled the class, listening to the Think aloud and providing feedback and help for the writer about how to perform the thinking aloud and apply the strategies taught.

Experimental Condition 2: Brief-CSRI The focus of this condition was to help students to achieve self-regulated planning procedures by focusing on the product form. As in experimental condition 1, in the initial sessions students were introduced to the OAIUE mnemonic in order to ensure good compare–contrast texts, but without any direct reference to a planning strategy. The concepts underlying the OAIUE mnemonic (objective, audience, content, coherence, and structure) were illustrated in different genres (descriptive, argumentative), and particularly, in compare–contrast texts. Sessions 3 and 4 followed the same sequence as experimental condition 1, that is, modelling followed by reflection and class discussion. However, in this experimental condition the teacher modelled an example of good (session 3) and incomplete (session 4) compare–contrast texts with references only to product goals structured around the OAIUE mnemonic. In the last sessions (5 and 6) students worked alone during the writing of an expository text (session 5) and a compare–contrast text (session 6). During writing, as in experimental condition 1, the students emulated the Think aloud that they had seen the model use in the previous sessions.

Control Condition The control condition was production-focused without any strategy instruction (without teaching explicit strategies for process and setting product goals) but with the same level of practice as the experimental conditions. The instruction focused on structural and linguistic features of the compare–contrast text. The instructional program is described briefly below.

In session 1, the instructor focused on teaching different types of texts and their characteristics. The instructor started the first session with a brainstorm on the importance of quality writing. After that, the instructor taught the objectives of three text types (argumentative, descriptive and compare–contrast text). Then, the instructor presented the students with two tasks in which they had to identify textual examples for each of the previously taught text types. In session 2 the instructor presented the structure and characteristics of each of the three types in detail. To make it easier for students to memorize content, the teacher gave the students different text types to identify and analyze the characteristics.

In sessions 3 and 4, students used different questions to analyze the specific characteristics of correct (session 3) and incomplete (session 4) examples of compare–contrast texts. Students individually practiced writing an expository text (not necessarily a compare–contrast text) in session 5, and specifically a compare–contrast text in session 6. After finishing the task, students read the texts in the class-group and the instructor and the class were able to provide feedback on whether the text had the required characteristics of a compare–contrast text.

Procedure

Training delivery

The study was conducted during the spring school term. Table 2 presents a timetable of study procedures. The sessions took place in literacy lessons. The full implementation of the program was carried out by 9 teachers (educational professionals with master’s degrees in Primary Education), one for each class. Classes were composed of 20–25 students. Each instructor taught the full 6 sessions of the program. All sessions lasted for approximately 60 min in all conditions. Teacher training was a principal means of ensuring that the program was carried out as intended, as it established what the teachers had to do and how they had to do it. Previous research has demonstrated the positive relationship between training teachers and students’ writing performance (e.g., De Smedt, Van Keer, & Merchie, 2016).

Table 2 Timeline of study procedures

Teacher preparation

Prior to the start of the intervention, a member of the research team (Ph.D. student specialist in Educational Psychology and writing instructional researcher) who guided the study methodology presented the CSRI program to the teachers (background, implementation schedule). Then, in order to facilitate the implementation of the CSRI program, all the teachers were given the complete set of materials for each student (individual portfolios) and a “teacher session manual” containing detailed descriptions of the 6 sessions. The manual contained: (a) Instructions for how to start, carry out, and finish each session; (b) The specific materials to be used for each step of the session and how to address the students; (c) Instructions about how to talk to the students and activities for them.

The researcher asked teachers to read the session information carefully before the start of training so that any questions could be discussed and clarified during the training sessions. There was a total of three training sessions (one for the direct teaching component, another for modelling and a final session for the peer practice component). Each training session was delivered a week before the corresponding component was implemented. All sessions lasted for approximately 60 min and followed the same two-part pattern. In the first part of the session, the researcher started with an explanation of the specific component and its goal. In the second part of the session the researcher explained and discussed the steps described in the teacher’s portfolio. Specifically, in the modelling training session, an example of Think aloud was provided for use in the instructional session and was practiced during the session. In this second part of the session, teachers were able to ask questions and resolve any issues they may have had about implementing the sessions.

Treatment fidelity

We used the following measures to ensure that the teachers implemented the program in the right way. First, all teachers were given manuals including the elements and activities for each session. Second, a specialist researcher met with the teachers weekly to train them in applying the instructional procedures. Third, the student portfolios with the set of materials were reviewed following the sessions, allowing us to check whether the students had correctly completed the tasks. Evidence from student portfolios suggested that training was delivered correctly. Fourth, intervention, evaluation and teacher preparation sessions were recorded in audio. The first author listened to the intervention session recordings and noted whether each step or procedure was completed. The fidelity for the three teachers of the full-CSRI group (experimental condition 1) averaged 96.8% (SD = 0.18, range 80–100), 97.1% (SD = 0.16, range 80–100) and 97.6% (SD = 0.19, range 80–100) respectively. For the three teachers of the brief-CSRI group (experimental condition 2) the mean was 97.6% (SD = 0.19, range 80–100), 90.3% (SD = 1.16, range 80–100) and 97% (SD = 0.24, range 80–100) respectively.
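One way to compute a checklist-based fidelity score of the kind reported above is sketched below. The scoring rule (percentage of checklist steps completed per session, then averaged per teacher) and the checklist entries are assumptions for illustration; the study does not describe its exact computation.

```python
from statistics import mean


def session_fidelity(steps_completed):
    """Percentage of checklist steps marked as completed in one session."""
    if not steps_completed:
        raise ValueError("a session needs at least one checklist step")
    return 100.0 * sum(steps_completed) / len(steps_completed)


def teacher_fidelity(sessions):
    """Return (mean, minimum, maximum) fidelity across a teacher's sessions."""
    scores = [session_fidelity(steps) for steps in sessions]
    return mean(scores), min(scores), max(scores)


# Hypothetical checklists (True = step completed); NOT data from the study.
sessions = [
    [True, True, True, True, True],
    [True, True, True, True, False],
    [True, True, True, True, True],
]
average, lowest, highest = teacher_fidelity(sessions)
print(f"mean = {average:.1f}%, range {lowest:.0f}-{highest:.0f}")
```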

Data analysis

The data were analyzed using the SPSS 24.0 program (IBM, Chicago, IL, USA). Nested data structures usually call for hierarchical linear modelling (HLM). However, this data analysis strategy is appropriate only when certain conditions are met (Gelman & Hill, 2006; Goldstein, 2003). One of those requirements concerns the sample size at each level of the hierarchical structure (Maas & Hox, 2005). These authors indicated that at least 50 classes constitute a sufficient sample size for accurate estimation. Since there were only nine classes in the present study, the effect of nesting was not taken into account in the analyses.

Consequently, we used Analysis of Covariance (ANCOVA) to evaluate the effects of the intervention at post-test and maintenance timepoints. The dependent variables were the three measures (structure, coherence and quality), and the independent variable was treatment condition (full-CSRI, brief-CSRI, and control). The covariate in the post-test analysis was students’ pre-test performance, and the covariate at maintenance was the post-test performance. Comparisons to check for specific differences between the three conditions were carried out by means of post hoc tests. We used Cohen’s (1988) criteria to interpret effect size, which state that the effect is small when ηp² = 0.01 (d = 0.20), medium when ηp² = 0.059 (d = 0.50), and large when ηp² = 0.138 (d = 0.80).
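For readers less familiar with ANCOVA, the logic of a one-way design with a single covariate can be sketched in a few lines. This is only an illustrative stand-in, not the SPSS procedure used in the study: it assumes a common regression slope across conditions, compares a covariate-only model with a covariate-plus-condition model, and uses invented scores. The function names and the example data are hypothetical.

```python
from statistics import fmean


def centered_sums(pairs):
    """Return (Sxx, Syy, Sxy) around the means of (covariate, outcome) pairs."""
    mean_x = fmean(x for x, _ in pairs)
    mean_y = fmean(y for _, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxx, syy, sxy


def ancova_one_covariate(groups):
    """One-way ANCOVA with one covariate and a common slope (sketch).

    groups: dict mapping condition name -> list of (covariate, outcome) pairs.
    """
    all_pairs = [pair for pairs in groups.values() for pair in pairs]
    n_total, n_groups = len(all_pairs), len(groups)

    # Residual sum of squares of the covariate-only model (ignores condition).
    total_xx, total_yy, total_xy = centered_sums(all_pairs)
    sse_reduced = total_yy - total_xy ** 2 / total_xx

    # Residual sum of squares of the covariate + condition model,
    # built from sums pooled within conditions.
    within_xx = within_yy = within_xy = 0.0
    for pairs in groups.values():
        sxx, syy, sxy = centered_sums(pairs)
        within_xx += sxx
        within_yy += syy
        within_xy += sxy
    sse_full = within_yy - within_xy ** 2 / within_xx

    df_effect = n_groups - 1
    df_error = n_total - n_groups - 1  # one extra df is spent on the covariate
    ss_effect = sse_reduced - sse_full  # adjusted condition sum of squares
    f_value = (ss_effect / df_effect) / (sse_full / df_error)
    partial_eta_sq = ss_effect / (ss_effect + sse_full)
    return {"F": f_value, "df": (df_effect, df_error), "partial_eta_sq": partial_eta_sq}


# Hypothetical (pre-test, post-test) scores; NOT data from the study.
example = {
    "full-CSRI": [(2, 4), (3, 4), (1, 3), (4, 5)],
    "brief-CSRI": [(2, 3), (3, 4), (1, 2), (4, 4)],
    "control": [(2, 2), (3, 3), (1, 2), (4, 3)],
}
result = ancova_one_covariate(example)
print(result)
```

Note that the error degrees of freedom are N − k − 1, which agrees with the F(2, 211) reported for the post-test analyses if 215 students in three conditions contributed data.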

Results

Preliminary results

Table 3 shows bivariate correlations between the reader-based measures at pre-test, post-test, and maintenance. Correlations between the three reader-based variables (Pearson’s r < 1) suggest good discriminant validity in the context of this study. Given that the skewness and kurtosis values of the variables (structure, coherence and quality) were within the intervals that denote a normal distribution (Kline, 2011), we used a parametric analysis.
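The normality screening described above can be illustrated with a short sketch that computes moment-based skewness and excess kurtosis. The cut-offs used here (absolute skewness below 3 and absolute kurtosis below 10) are the values commonly attributed to Kline (2011); they are an assumption, since the text does not state which intervals were applied, and the function names and scores are hypothetical.

```python
from statistics import fmean


def skewness_and_kurtosis(values):
    """Return moment-based skewness and excess kurtosis of a sample."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis


def within_limits(values, max_abs_skew=3.0, max_abs_kurtosis=10.0):
    """Check both indices against assumed cut-offs (not stated in the text)."""
    skewness, excess_kurtosis = skewness_and_kurtosis(values)
    return abs(skewness) < max_abs_skew and abs(excess_kurtosis) < max_abs_kurtosis


# Hypothetical scores; NOT data from the study.
scores = [1, 2, 3, 4, 5]
print(skewness_and_kurtosis(scores), within_limits(scores))
```

Statistical packages often apply small-sample corrections to these indices, so values from SPSS may differ slightly from the plain moment-based estimates shown here.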

Table 3 Bivariate correlations, skewness and kurtosis among reader-based measures at pre-test, post-test, and maintenance

We also analyzed differences in pre-test measures regarding condition. The results indicated that the differences between conditions were statistically significant for quality, F(2, 212) = 6.716, p = 0.001, ηp² = 0.06, but not significant for structure (p = 0.648) or coherence (p = 0.508). Table 4 summarizes the means and standard deviations for each variable (structure, coherence and quality) by condition, in the pre-test, post-test and maintenance assessments (Fig. 1).

Table 4 Means and standard deviations for coherence, structure and quality by condition
Fig. 1 Evolution in the performance in each variable by condition

Compare–contrast text results

Post-test results

For the structure variable, after controlling for differences in pre-test performance, the ANCOVA showed that the condition was not significant in the post-test measure, F(2, 211) = 2.392, p = 0.094, ηp² = 0.02.

For the coherence variable, after controlling for initial differences in pre-test performance, the ANCOVA showed that the condition was significant in the post-test measure, F(2, 211) = 9.222, p ≤ 0.001, ηp² = 0.08. Post hoc analysis showed statistically significant differences between the full-CSRI group (experimental condition 1) and the control group (p ≤ 0.001), and also between the brief-CSRI group (experimental condition 2) and the control group (p = 0.019). We found no differences between the full-CSRI (experimental condition 1) and the brief-CSRI (experimental condition 2) groups (p = 0.727).

Finally, for the quality variable, the ANCOVA showed that the condition was significant in the post-test measure, F(2, 211) = 7.061, p = 0.001, ηp² = 0.06. Post hoc analysis showed statistically significant differences between the full-CSRI group (experimental condition 1) and the control group (p = 0.011). We found no differences between the brief-CSRI (experimental condition 2) and the control groups (p = 0.98), or between the full-CSRI (experimental condition 1) and brief-CSRI (experimental condition 2) groups (p = 0.11).

Maintenance results

For the structure variable, after controlling for post-test differences, the ANCOVA showed that the condition was significant in the maintenance measure, F(2, 99) = 8.606, p ≤ 0.001, ηp² = 0.14. Post hoc analysis showed statistically significant differences between the full-CSRI (experimental condition 1) and control groups (p ≤ 0.001), and also between the brief-CSRI (experimental condition 2) and control groups (p = 0.030). We found no differences between the full-CSRI (experimental condition 1) and brief-CSRI (experimental condition 2) groups (p = 0.980).

For the coherence variable, after controlling for post-test differences, the ANCOVA showed that the condition was not significant in the maintenance measure, F(2, 99) = 2.302, p = 0.10, ηp² = 0.04.

Finally, for the quality variable, after controlling for post-test differences, the ANCOVA showed that the condition was significant in the maintenance measure, F(2, 99) = 9.288, p ≤ 0.001, ηp² = 0.15. Post hoc analysis showed statistically significant differences between the full-CSRI (experimental condition 1) and control groups (p ≤ 0.001), and also between the brief-CSRI (experimental condition 2) and control groups (p = 0.031). We found no differences between the full-CSRI (experimental condition 1) and brief-CSRI (experimental condition 2) groups (p = 0.53).
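As a reading aid for the effect sizes reported in this section, partial eta squared can be recovered from an F statistic and its degrees of freedom, and then labelled with the Cohen (1988) benchmarks cited in the Data analysis section. The sketch below does this; the conversion to d assumes two groups of equal size (d = 2f, with f² = η²/(1 − η²)), the "negligible" label for values below 0.01 is our own addition, and the function names are hypothetical.

```python
from math import sqrt


def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def eta_squared_to_d(eta_sq):
    """Convert eta squared to Cohen's d, assuming two equal-sized groups."""
    return 2 * sqrt(eta_sq / (1 - eta_sq))


def cohen_label(eta_sq):
    """Label an effect using the benchmarks 0.01, 0.059 and 0.138."""
    if eta_sq >= 0.138:
        return "large"
    if eta_sq >= 0.059:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"  # label below the smallest benchmark is an assumption


# F statistics as reported in the text for post-test coherence
# and maintenance structure.
for name, f_value, df_effect, df_error in [
    ("coherence, post-test", 9.222, 2, 211),
    ("structure, maintenance", 8.606, 2, 99),
]:
    eta_sq = partial_eta_squared(f_value, df_effect, df_error)
    print(name, round(eta_sq, 3), cohen_label(eta_sq))
```

Values obtained this way should land close to the reported ones, allowing for differences in rounding in the last decimal place.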

Discussion

To achieve self-regulating behavior during writing, teaching planning procedures is considered essential (Kellogg, 2018). However, how much and what types of planning procedures are valuable varies across students’ ages and textual genres (Galbraith, 2009; MacArthur, 2017). This study explores for the first time the short- and long-term effects of two forms of CSRI strategy-focused instruction on 4th grade students’ writing skills. Full-CSRI (experimental condition 1) taught students strategies for pre-planning and drafting a compare–contrast text. In contrast, in the brief-CSRI group (experimental condition 2), students learned a strategic approach to set product goals but their instruction did not include any mention of the planning procedure. After 6 sessions of training, the results suggested that both CSRI groups produced texts that were assessed as more coherent than those of the control condition, but only the full-CSRI group produced texts of higher quality, and neither group produced better-structured texts. Nevertheless, before a firm conclusion can be drawn about the short-term benefits of strategy-focused instruction, the results need to be explained in more detail.

In relation to the structure measure, contrary to our predictions, the results suggested that students made good progress under the three conditions between pre-test and post-test with no differences between them. One reason for this lack of difference compared to the control condition is that some students’ writing performance was affected by their domain knowledge of text structure (Hammann & Stevens, 2003; Murphy & Alexander, 2002). In this regard, students are familiar with the concept of textual structure because in Spanish primary schools writing instruction is typically focused on learning the structure of texts. Moreover, it is part of students’ cultural background to have read narratives with an introduction, a development, and a conclusion. If students think that they already know about the concept of text structure, they may not perceive the utility of using the strategies.

Unlike structure, coherence and quality are constructs that are rarely taught explicitly in the process of learning to write (e.g., García et al., 2010). Moreover, achieving high coherence and quality in a compare–contrast text is a challenge for young students given the organizational demands of this type of text (Englert & Hiebert, 1984; Shen & Troia, 2018). Therefore, planning procedures become indispensable for the student, because planning intrinsically involves reasoning about content (about the internal set of relationships of ideas and their consistency, about how to develop an idea to capture interest and so forth) (e.g., Hayes, 2012; Torrance, 2015). In this regard, our results showed that after the end of the intervention, the students from both experimental conditions wrote more coherent texts and only the group who had full-CSRI instruction produced higher quality texts than the control condition.

On the other hand, in relation to the question of which experimental condition is more beneficial compared to the control, the results showed that, consistent with previous findings (e.g., Kiuhara et al., 2012; Palermo & Thomson, 2018; Saddler, Moran, Graham, & Harris, 2004; Shen & Troia, 2018), teaching explicit planning procedures, such as advance planning, in tandem with genre knowledge results in more competent writers. The full-CSRI condition not only showed significant improvements, but showed them in both measures, coherence and quality.

Moreover, as the literature suggested, planning in advance functioned as a support or external memory (Graham et al., 2005) that helped students to simplify the process of thinking about content during composition and reduced cognitive overload (Kellogg, 2008). In this study, this could explain why the results from the brief-CSRI group were lower and even why the quality of their texts was similar to that of the control condition. Textual quality involves, but is not limited to, structure and coherence. It also involves the complexity and richness of ideas, interesting detail, and correct usage of mechanics. It is, in short, the overall merit of the text. Certainly, we taught a strategic approach (the vowels mnemonic) to help brief-CSRI students achieve those product goals. However, unlike the full-CSRI students, the brief-CSRI students were left to work these considerations into their own writing processes. This probably caused cognitive overload during writing. In other words, because students in the brief-CSRI condition were not taught, and did not subsequently adopt, explicit pre-planning processes, they attempted to pursue product goals to achieve quality text while also struggling with translating their thoughts into words.

Seven months after the end of the intervention, as we predicted, the CSRI students (full-CSRI and brief-CSRI) wrote better compare–contrast texts (reflected in greater structure and quality of the text product) than the students in the control condition. Again, the contrast with the control condition was clearest for the full-CSRI students (p ≤ 0.001 for both structure and quality), although the differences between the full-CSRI and brief-CSRI groups were not statistically significant. This is consistent with the few studies which have evaluated the maintenance effects of strategy-focused instruction in elementary grades (e.g., Fidalgo et al., 2008; Glaser & Brunstein, 2007; Tracy et al., 2009). In these studies, the authors found that students who had received strategy-focused instruction (in planning, drafting and revising processes) wrote better compositions compared to the control 2 weeks after the instruction (Tracy et al., 2009), 5 weeks after the instruction (Glaser & Brunstein, 2007), and even 28 months after the intervention finished (Fidalgo et al., 2008). Of course, there was, on average, a decrease in the three variables between the post-test and maintenance timepoints for all three conditions. However, the CSRI students did not fall back to their pretreatment levels and their means were above those of the control group. This result suggests the possibility that CSRI training might benefit from booster sessions and/or from increasing the length of the initial treatment to ensure greater maintenance effects.

Finally, we compared our effect sizes with those of previous studies with elementary grades (Graham & Perin, 2007; Koster et al., 2015) and CSRI studies in particular (e.g., Torrance et al., 2015). We found that the effect sizes in our study on the three measures (coherence, structure and quality) were much smaller than the large effects found in those previous evaluations of instructional strategies. The most reasonable explanation is the combination of the nature of the sample, given that students were younger than those in previous studies with CSRI programs, and the novel instructional content. Learning with the CSRI program was a challenge for the 4th grade students because they had to learn different strategies in a single, short session each week. It was also a challenge for the teachers, who had never before used instructional methods such as modelling or peer practice to teach writing.

In summary, these findings, although limited, add to our understanding of the mechanisms underlying the effects of CSRI strategy-focused instruction. More specifically, teaching process strategies, particularly pre-planning, to fourth grade students is more beneficial than traditional instruction over both short- and long-term periods.

Limitations and future directions

Caution must be applied when interpreting the intervention effects because there was significant sample attrition between the intervention and the maintenance evaluation. Clearly, further research is needed to replicate this study using a larger sample and more heterogeneous groups (i.e., different school contexts with different students, for example, students with learning difficulties). Moreover, more research is needed using online measures which allow us to ascertain the distribution of processes during composition and the contribution of different aspects to text quality (López et al., 2019). Finally, we suggest that in order to verify in a more valid and reliable way whether students are benefiting from developing their planning skills, it would be helpful to test whether student performance gains transfer to genres that are not the focus of the instruction (Torrance, 2015). In addition, we suggest that future research explore in more detail the learning that takes place in the brief-CSRI condition. This condition may have resulted in implicit goal setting, and goal setting can lead to planning.

Educational implications

It is important to note that through the CSRI program teachers and students are placed in a supportive writing environment with the message that writing needs to be taught and modelled effectively in order to be learned, and that it should not be understood as a complementary task that is learned automatically. In this study we explored three instructional conditions that may help educators to make evidence-based decisions. Thus, we hope that educators contextualize this knowledge in order to design the best possible writing instruction for their classes.