Introduction

Successful written communication in college and career requires that a person be intimately familiar with the specific academic language (AL) that the task demands. For example, when writing a historical account, the author should write like a historian; when reporting scientific phenomena, like a scientist (Nagy & Townsend, 2012; Schleppegrell, 2001; Silliman & Scott, 2009). The key to developing the specialized written language needed in college and career is to learn the general and discipline-specific AL of text throughout schooling (Beck, McKeown, & Kucan, 2005; Silliman & Scott, 2009). AL is broadly defined as the level of formal language usage (i.e., register) in schooling, with register features that are more frequent or pervasive within the context of school (Biber & Conrad, 2009; Schleppegrell, 2001). More specifically, students in late elementary and middle school demonstrate facility with AL as they begin to comprehend and use more of the vocabulary, syntactic structures, and discourse structures that are prevalent in academic text (Nagy & Townsend, 2012; Snow & Uccelli, 2009). Academic vocabulary words and expository text structures are relatively easy to identify as AL. However, syntactic structures like embedded clauses and nominalization, which are more difficult to quantify, are also a critical component of academic language (Nagy & Townsend, 2012; Snow, 2010; Uccelli, Barr, Dobbs, Galloway, Meneses, & Sanchez, 2014).

Linguistic research demonstrates that students as early as Grade 4 begin to incorporate more ‘school-like’ vocabulary words, more complex syntactic structures, and genre-specific text structure into their written composition, with that growth plateauing around Grade 7 or 8 (Beers & Nagy, 2011; Berman & Nir-Sagiv, 2009). This seminal research demonstrates that students do incorporate AL in their written composition. Unfortunately, the role of linguistic knowledge is often downplayed when studying school-based written expression outcomes (Berman & Nir-Sagiv, 2009). Research has focused more on the role of handwriting, spelling, and the cognitive processes of written composition (planning, text generation, translating, reviewing, and revising) for informing school-based outcomes and instruction (Berninger, Fuller, & Whitaker, 1996; Berninger et al., 2006; Hayes & Flower, 1980). For each of these processes of writing, highly effective interventions exist that improve writing outcomes for both late elementary and middle school students (Graham et al., 2012, 2016).

The role of AL has recently been established for students’ reading comprehension outcomes in schools (e.g., Foorman, Petscher, Stanley, & Truckenmiller, 2017; Uccelli, Galloway, Barr, Meneses, & Dobbs, 2015), but less is known about how AL relates to students’ written composition outcomes. We first draw on literature about the broader role of language for reading and writing outcomes and how AL plays a similar role in reading achievement. Next, we draw on the well-established evidence that reading and writing outcomes share underlying components. Third, we explore one more characteristic of AL that might be important for understanding the role of AL in written composition. That characteristic is the way that AL skills are integrated with lower-level word recognition skills and higher-level reading comprehension skills. Finally, we introduce our study that (a) explores how AL is related to written composition and (b) investigates the interrelation between AL and other reading skills to predict written composition.

Academic language register

Studies frequently confirm the large and significant role of students’ skill with the three levels of language mentioned above for performance on school-based reading comprehension (e.g., Vellutino, Tunmer, Jaccard, & Chen, 2007) and written composition outcomes (Berninger & Abbott, 2010; Silverman et al., 2015). These studies suggest that all students must draw upon their knowledge of word meanings (vocabulary), sentence structure (syntax), and the organizational structure of discourse in order to perform on reading and writing outcome measures in school. AL features more precise vocabulary, more complex syntactic structures, and denser discourse structure than more general language (Beck et al., 2005; Scott & Silliman, 2009; Snow, 2010; Snow & Uccelli, 2009). Recently, two groups developed school-based measures that more precisely focus on the AL register when measuring students’ abilities within the language comprehension component of the Simple View of Reading (Foorman, Petscher, & Schatschneider, 2015b; Uccelli et al., 2014). The Core Academic Language Skills instrument (CALS-I; Uccelli et al., 2014) measures language forms and functions inherent to school-based tasks across disciplines in Grades 4 through 8, including morphological decomposition, complex syntax, connectives, anaphoric resolution, argumentative text organization, and identification and production of academic register. They found that the six tasks represented one unitary construct of AL (Uccelli et al., 2014). In a reading screening and diagnostic measure for Grades 3 through 12, Foorman et al. (2015b) focused their language comprehension tasks on the academic register. To measure vocabulary, they used a morphological decomposition task with words common across school-based disciplines; to measure syntax, they employed items targeting complex syntax (embedded clauses and nominalization), connectives, and anaphoric resolution. Both groups demonstrate that, in Grades 4 through 8, general AL knowledge plays a significant and large role in reading proficiency (Foorman et al., 2017; Uccelli et al., 2015). We anticipate a similar relationship with written composition outcomes due to the relationship between reading and writing.

Writing and reading share academic language components

Although there are other theories about why students tend to achieve similarly in reading and writing (e.g., rhetorical relations and procedural approach), we draw upon the shared knowledge theory (Fitzgerald & Shanahan, 2000) because it suggests that there are malleable underlying components that can be taught in school. The shared knowledge view expects that students’ development of reading and writing should parallel each other because both rely on students’ knowledge of language structure at the word (vocabulary), sentence (syntax), and text (discourse) levels (Englert, Okolo, & Mariage, 2009). Empirical evidence of this theorized relationship is growing. Two longitudinal studies of elementary and middle school development provide evidence of the shared knowledge between reading and writing at all three levels of language (Abbott, Berninger, & Fayol, 2010; Ahmed, Wagner, & Lopez, 2014). When writing text, Carretti, Re, and Arfé (2013) found that poor reading comprehenders aged 8–10 were likely to make more spelling errors (word level) and draw fewer causal connections (sentence and discourse level) between events than good comprehenders. Intervention research also suggests that educators may be able to have an impact on both reading and writing outcomes by focusing on shared knowledge components, with writing intervention having a small to moderate impact (effect sizes ranging from 0.28 to 0.68) on reading comprehension (Graham & Hebert, 2011). However, intervention research examining the reciprocal relation—the influence of discourse-level reading instruction on text-level writing—is not as strong (Graham & Harris, 2017). After an extensive review of the literature, Graham and Harris (2017) call for more research that investigates the size of the role of each level of reading skill in writing proficiency.

The role of academic language may be integrated with other skills

As we further explore the role of AL in written composition, we hypothesize that AL knowledge will be integrated with other underlying literacy skills at the discourse and word levels, namely reading comprehension and word recognition. We aim to demonstrate the overlap of these skills because it may be important for eventually informing instruction. For example, when vocabulary instruction is combined with discourse-level instruction, the strongest impacts on reading outcomes are achieved (Herrera, Truckenmiller, & Foorman, 2016; Scammacca, Roberts, Vaughn, & Stuebing, 2015; Wright & Cervetti, 2017). Researchers suggest that it is the interdependence of component reading skills that may make integrated instruction more effective (e.g., Abbott et al., 2010).

Several studies of reading outcomes suggest that language overlaps with reading comprehension at the discourse level and word recognition at the word level (Foorman, Koon, Petscher, Mitchell, & Truckenmiller, 2015a; Foorman, Petscher, & Herrera, 2018; Kieffer, Petscher, Proctor, & Silverman, 2016; Silverman et al., 2015; Uccelli et al., 2015; Vellutino et al., 2007). Specifically, Foorman and colleagues (2018) demonstrate that word recognition and language together explain nearly 100% of the variance in reading comprehension and that word recognition and language share 19% common variance in reading at Grade 4 and 31% common variance at Grade 8. Although it has not been directly tested, researchers hypothesize that there is overlap between these skills in Grades 4 through 8 for written composition outcomes as well (e.g., Abbott et al., 2010; Silverman et al., 2015).

Scarborough (2001) illustrates this unique and shared contribution of literacy skills as a rope, with each skill represented as a strand of the rope. When students are first learning a skill, each skill is a separate strand. As students become more strategic in drawing on other levels of language knowledge, the strands twist together into a strong rope representing skilled reading. The current study will explore this skill integration for written composition proficiency. Given Scarborough’s description and Foorman and colleagues’ findings, we anticipate a high amount of unique variance at Grade 4 and a higher amount of common variance at Grade 8. The specific role of each component predicting written composition is explored next.

Academic language knowledge at the word level

The most commonly studied component of AL is general academic vocabulary. General academic vocabulary words are common in elementary and middle school text and discourse and include words that cross disciplines, words like ‘function’, ‘abstract’, and ‘constitute’ (Beck et al., 2005). By contrast, discipline-specific vocabulary words (e.g., polynomial, cytoplasm, federalism) appear in elementary and middle school subjects but are most common in high school. Knowledge of general academic vocabulary is a well-established component of reading comprehension (e.g., Nagy & Townsend, 2012; Wright & Cervetti, 2017).

When examining writing, oral vocabulary plays a significant role in written composition proficiency, but only when integrated with syntax and morphology skills (Silverman et al., 2015). One rationale for this integration is that the meaning of vocabulary words may be determined through morphological knowledge of the root words and affixes or through inference at the sentence or discourse levels (Nagy & Townsend, 2012). In their measure of AL, Uccelli et al. (2014, 2015) demonstrated that morphological and syntactic knowledge were two important components of their unidimensional construct of AL. The measurement of word-level academic language knowledge in the current study reflects this dependence: it requires morphological decomposition, and the items are presented within a sentence context.

Academic language knowledge at the sentence level

The next most studied AL construct is syntax. A student’s knowledge of sentence structure is generally recognized as making a substantial contribution to their comprehension and writing of academic text (RAND Reading Study Group, 2002; Scott, 2009; Silverman et al., 2015; Snow & Uccelli, 2009). However, syntactic knowledge contributes to higher-level outcomes only when combined with other components of language; in combination, syntax plays a significant role in both reading outcomes (Foorman et al., 2015a, 2017, 2018; Kieffer et al., 2016) and written composition outcomes (Berninger & Abbott, 2010; Silverman et al., 2015). Intervention research parallels these findings: interventions focused on syntax skills significantly affect sentence composition, but not higher-level written composition (Datchuk & Kubina, 2012). Given these previous findings, we anticipate that sentence-level academic language will be influential for written composition and that most of its influence will be shared with reading comprehension and word recognition.

Reading comprehension

Knowledge of the structure of academic text facilitates content knowledge acquisition (Englert et al., 2009; Silliman & Scott, 2009; Snow & Uccelli, 2009). At the discourse or text-level, students comprehend the speaker/author’s message through understanding how concepts across sentences relate to each other. For example, social studies text may have a compare-contrast or persuasion structure (Englert et al., 2009). When assessing students’ skills, text structure is typically operationalized as a part of reading comprehension or written composition quality. However, performance on higher-level skills like reading comprehension requires integration of a large number of cognitive abilities and lower-level skills, including the other three skills measured in this study. Therefore, we expect that the roles of vocabulary, syntax, and word recognition will significantly overlap with reading comprehension. Given the relation between reading comprehension and written composition at the highest level of language (e.g., Abbott et al., 2010), we hypothesize that reading comprehension will play the largest role in determining proficiency in written composition.

Word recognition

At the word level, both reading and writing require that students have knowledge of the correspondence between letters and sounds. In reading, letter-sound correspondence is represented by word recognition; in writing, it is represented by spelling. Knowledge of letter-sound correspondence is one of the most well-documented components necessary for both higher-level reading and writing outcomes (e.g., Berninger et al., 2006; Graham & Harris, 2017). Although most students master letter-sound correspondence for word recognition earlier in elementary school, spelling remains an area of development for a significant number of low-performing students throughout middle school and beyond. In fact, spelling instruction has demonstrated impacts on spelling in context for late elementary students and on reading comprehension for middle school students (Graham & Santangelo, 2014), thus supporting the need for an underlying knowledge of word structure in the shared knowledge view of reading and writing for students in elementary and middle school (Graham & Harris, 2017). In the current study, the word recognition task is closer to a spelling recognition task than to a general word recognition measure. It requires students to recognize the letter-sound correspondence and correct orthographic representation of a spoken word from among similarly, but incorrectly, spelled alternatives. Because knowledge of word structure is not specific to AL, measurement of word recognition in this study is not considered part of the AL construct. However, English is a morphophonemic language: strong relations between students’ morphological, phonological, and orthographic skills appear in Grades 4 and 5 and continue to be important across middle school as the words in academic text become more morphologically complex (Deacon & Kirby, 2004). Because of the close relation between morphology (e.g., the roots and affixes common in AL) and word recognition/spelling across the middle grades (Foorman et al., 2015a; Garcia & Cain, 2014), we investigate the role of word structure knowledge in relation to AL at both grade levels in our study.

Purpose of the current study

In the current study, we use a reading screening and diagnostic assessment that was designed to measure reading comprehension, AL (at the word and sentence levels), and word recognition to determine whether the underlying shared linguistic knowledge of each of these skills relates to written composition proficiency in a way similar to how they relate to reading outcomes. First, we explore the role of AL for written composition. Next, we replicate the findings that interrelated reading skills (word recognition and reading comprehension) also represent knowledge necessary for written composition. Finally, we explore the integration of AL with reading comprehension and word recognition for proficiency with written composition. Identifying the role of AL as a significant but not independent component of written composition may be a step toward identifying more effective classroom instruction in elementary and middle school.

Method

Participants and setting

A total of 1316 students in Grade 4 from 15 elementary schools and 1302 students in Grade 8 from five middle schools in a large district in Florida participated. Demographic data for the participating students are detailed in Table 1. Compared to the State of Florida demographics for Grade 4, the participating students were approximately representative except in two categories: the State had a higher percentage of Black students (22%) and a higher percentage of economically disadvantaged students (61%). The Grade 8 participants represented a more diverse range of students than the Grade 4 participants. Two-thirds of the participating students in Grade 8 were considered economically disadvantaged, which is higher than the State average for Grade 8 (47%). The Grade 8 sample also included more Hispanic and fewer White participants than the population of Grade 8 students in Florida.

Table 1 Demographic characteristics of the participating students

Measures

FCRR Reading Assessment (FRA)

The FRA (Foorman et al., 2015b) is a screening and diagnostic assessment designed to measure the teachable components most predictive of reading comprehension. The FRA for Grades 3 through 10 includes four subtests, each of which was designed using Item Response Theory to be computer adaptive.

Word recognition

Word Recognition requires students to listen to a word and choose the correctly spelled real word or non-word from a list of three words. The real words in this assessment were sampled from words that appear in academic text in Grades 3 through 10 (Zeno, Ivens, Millard, & Duvvuri, 1995). Distractors were designed to be orthographically challenging but not phonologically plausible alternate spellings, thus measuring students’ knowledge of letter-sound correspondence. Concurrent validity with the Test of Word Reading Efficiency, 2nd edition ranges from 0.30 to 0.46 (Foorman et al., 2015b), indicating some convergent validity with single-word oral decoding fluency and some divergent validity that may be due to differences in the nature of the Word Recognition task (recognition versus oral production, decoding versus encoding, timed versus untimed). The average marginal reliability of this subtest, as reported in the technical manual, for Grades 4 through 8 is 0.93.

Vocabulary knowledge

The Vocabulary Knowledge task requires students to read a sentence and choose, from a list of three words, the correct morphological structure that fits the sentence. For example: In some states you can get a driver’s [permission, permissive, permit] when you are 14 years old. In this example, the correct response (permit) is the root, and the distractors add derivational suffixes that do not appropriately fit the sentence, thus requiring a morphological decomposition strategy for understanding the vocabulary. Words were chosen based on their frequency in academic text (Zeno et al., 1995) and expert knowledge of the general academic words common in Grades 3 through 10 (Foorman, Petscher, & Bishop, 2012). The average marginal reliability of this subtest for Grades 4 through 8 is 0.91. Correlations with the Peabody Picture Vocabulary Test-4th Edition range from 0.47 to 0.67 (Foorman et al., 2015b), indicating some convergent validity with word-level vocabulary and some divergent validity that may be due to the morphological, reading, or sentence-level AL skills needed to perform this task. It should be noted that although this subtest mostly consists of the morphological component of vocabulary knowledge, it is named Vocabulary Knowledge for ease of use by educators.

Syntactic knowledge

Syntactic Knowledge items consist of one to two sentences that are missing a word or short phrase. For example: Turtles are most vulnerable during the first hours of life. [Consequently, Nevertheless, Otherwise], only one in every thousand turtles born will become an adult. The student must choose the connective that best represents the relationship between two or more ideas, thus creating cohesion and complexity at the sentence level. Most items evaluated a student’s knowledge of text cohesion through temporal (before, then), logical (furthermore, in conclusion), causal (provided that, therefore), or adversative (although, by contrast) connectives. The connectives were drawn from the most common connectives in text as identified by Coh-Metrix (McNamara, Louwerse, Cai, & Graesser, 2005). Some items involved pronoun reference (anaphora and cataphora) and verb tense agreement. Although some of the connectives, pronoun references, and verbs used in this assessment occur in oral language, students’ performance on each item depends on parsing the rest of the sentence, which has the qualities of academic syntax, including a higher density of nominalization, embedded clauses, adverbial fronting, and more academic content. Given the broader array of linguistic skills needed to complete this task, we refer to the construct this task measures as sentence-level academic language comprehension. Concurrent validity with the Grammaticality Judgement subtest of the Comprehensive Assessment of Spoken Language ranges from 0.37 to 0.61 (Foorman et al., 2015b), indicating some convergent and divergent validity with an assessment of oral sentence-level grammar knowledge. The average marginal reliability of Syntactic Knowledge for Grades 4 through 8 is 0.93.

Reading comprehension

This assessment consisted of passages (200–1300 words), each with a set of seven to nine multiple-choice questions. The questions were designed to measure general academic vocabulary as specified in the Common Core State Standards (CCSS) Language strand and the structure of academic text, both informational and narrative, as specified in the CCSS Reading for Information and Reading Literary text strands. Concurrent validity was established (r = 0.67–0.74) with the Stanford Achievement Test—10th Edition (Foorman et al., 2015b). This assessment was designed to mirror academic text in literature, science, and social studies: 24% of the passages provided information on science phenomena (e.g., hurricanes, electricity, properties of air pressure, decomposition), 27% focused on social studies content, and 42% were primarily narrative. Performance explicitly required the other, lower-level skills: students needed to decode the passages and questions (word recognition); 30% of the multiple-choice questions specifically targeted the student’s understanding of general academic vocabulary within the context of the text (vocabulary knowledge); and many questions required students to understand the connection of ideas within and across sentences (syntactic knowledge). In this study, 98% of the sample obtained a marginal reliability of 0.80 or higher.

Florida Comprehensive Assessment Test (FCAT) 2.0 Writing

In 2013, all eligible students in Grades 4 and 8 in Florida were required to take the FCAT 2.0 Writing, a 60-minute written composition task. In Grade 4, students were instructed to write a narrative about a time they won something special. In Grade 8, students were instructed to compose a persuasive response that would convince a person to visit their town. Each essay was scored by two trained raters on a holistic scale ranging from 1 point to 6 points. The inter-rater correlation was 0.71 for Grade 4 and 0.72 for Grade 8 (Florida Department of Education, 2014). In the current study, students who received a failing score (< 3.5) were assigned a dummy code of 1 and students who received a passing score (≥ 3.5) were assigned a dummy code of 0. Because the FRA is a screening tool designed to identify students who are at risk for failure, the models in this study predict writing risk (failing the FCAT 2.0 Writing) instead of writing proficiency (passing the FCAT 2.0 Writing). We chose the dichotomous outcome over the holistic scale because the 1-to-6-point holistic scale is not an equal-interval scale and therefore would not yield a useful interpretation.
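As a minimal sketch of this coding rule in R (the score vector below is illustrative, not study data):

    # The risk coding rule described above; `holistic` is an illustrative
    # vector of two-rater holistic scores (1-6), not study data.
    holistic <- c(2.5, 3.5, 4.0, 3.0)
    writing_risk <- ifelse(holistic < 3.5, 1, 0)  # 1 = failing, 0 = passing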

Procedures

The FRA was group-administered in computer labs during the winter assessment period (January through March) of the district’s typical interim reading assessment process. The FRA was computer-scored instantaneously, producing a z-score for each of the subtests. The z-score reflects the entire range of student performance in the representative normative sample, which spans Grades 3 through 10. The FCAT 2.0 Writing was group-administered to participants at the end of February following standard procedures.

Data analysis

A series of generalized linear mixed models was used to study the relation between reading predictors and writing risk outcomes. Mixed models were used to account for the non-independence of observations, whereby students were nested within schools. Although students were also nested within classrooms, that information was not available, so a two-level model was used. Eight total models were estimated at each grade level. The first model was a null model that estimated the log-odds of writing risk and served two purposes: first, it provided an unconditional estimate of writing risk; second, it allowed for the base calculation of the intraclass correlation for students and schools. Model 1 included only Reading Comprehension as the independent variable; Model 2 included only Word Recognition; and Model 3 included the two AL predictors (i.e., Vocabulary Knowledge and Syntactic Knowledge). Model 4 included all of the independent variables. Models 5, 6, and 7 systematically excluded Reading Comprehension, Word Recognition, or AL in order to calculate the total, common, and unique variance decompositions, explained next.
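To make this model series concrete, the sketch below shows how the null model and two of the conditional models might be specified with R’s lme4 package (the package used for all analyses here). The data frame and variable names (dat, writing_risk, school, vocab, syntax, read_comp, word_rec) are hypothetical, and the latent-threshold intraclass correlation formula is the standard one for logistic mixed models rather than a computation reported by the study.

    library(lme4)

    # Null model: unconditional log-odds of writing risk, with a random
    # intercept for school (students nested within schools).
    m0 <- glmer(writing_risk ~ 1 + (1 | school),
                data = dat, family = binomial)

    # Latent-threshold intraclass correlation for a logistic mixed model:
    # school-level variance over school-level plus residual variance (pi^2/3).
    tau00 <- as.numeric(VarCorr(m0)$school)
    icc_school <- tau00 / (tau00 + pi^2 / 3)

    # Model 3 (the two AL predictors) and Model 4 (all predictors).
    m3 <- glmer(writing_risk ~ vocab + syntax + (1 | school),
                data = dat, family = binomial)
    m4 <- glmer(writing_risk ~ read_comp + word_rec + vocab + syntax +
                  (1 | school), data = dat, family = binomial)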

The pseudo-R² values from these models were used to estimate the unique, common, and total variances related to AL, Reading Comprehension, and Word Recognition in predicting writing risk. Because reading comprehension is a higher-order skill that comprises AL and word recognition, two separate sets of models were run: one that included Reading Comprehension and one that excluded it. For the ‘no reading comprehension’ models, the unique effect of AL was calculated by subtracting the Model 2 pseudo-R² from the Model 5 pseudo-R² (i.e., subtracting the Word Recognition model from the Word Recognition plus AL model). Similarly, the unique effect of Word Recognition was calculated by subtracting the Model 3 pseudo-R² from the Model 5 pseudo-R² (i.e., subtracting the AL model from the Word Recognition plus AL model). The common variance was then calculated as the total variance (Model 5) minus the two unique variance components.

For the variances inclusive of Reading Comprehension, the unique effect of AL was calculated by subtracting the pseudo-R² of the Word Recognition and Reading Comprehension model (Model 7) from the Model 4 pseudo-R², which included all components. The unique effect of Word Recognition was calculated by subtracting the Model 6 pseudo-R² from the Model 4 pseudo-R², and the unique effect of Reading Comprehension was calculated by subtracting the Model 5 pseudo-R² from the Model 4 pseudo-R². The common variance was then calculated as the total variance (Model 4) minus the three unique variance components. All analyses were estimated using the lme4 package (Bates, Mächler, Bolker, & Walker, 2014) in R.
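As a minimal sketch of this decomposition arithmetic, assuming the pseudo-R² for each model has already been extracted (the numeric values below are placeholders chosen only to illustrate the subtraction, not the study’s estimates):

    # Pseudo-R2 for Models 4 through 7 (placeholder values for illustration).
    r2_m4 <- 0.65  # all predictors
    r2_m5 <- 0.61  # Word Recognition + AL (no Reading Comprehension)
    r2_m6 <- 0.62  # Reading Comprehension + AL (no Word Recognition)
    r2_m7 <- 0.54  # Reading Comprehension + Word Recognition (no AL)

    unique_AL <- r2_m4 - r2_m7  # drop in pseudo-R2 when AL is removed
    unique_WR <- r2_m4 - r2_m6  # drop when Word Recognition is removed
    unique_RC <- r2_m4 - r2_m5  # drop when Reading Comprehension is removed
    common    <- r2_m4 - (unique_AL + unique_WR + unique_RC)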

Results

Descriptive statistics

A review of data completeness revealed that in Grade 4, 26% of the Vocabulary Knowledge data were missing, as were 25% of Word Recognition, 21% of Syntactic Knowledge, and 1% of FCAT 2.0 Writing data; no data were missing for Reading Comprehension. Little’s test of data missing completely at random (MCAR) was not supported, χ²(26) = 74.34, p < 0.001. A review of the missing patterns suggested that the mechanism for missingness was not due to each variable itself. Similarly, missing data in Grade 8 were 54% for Word Recognition, 46% for Syntactic Knowledge, 26% for Vocabulary Knowledge, and 2% for FCAT 2.0 Writing, with no missing data on Reading Comprehension. The MCAR test was significant, χ²(19) = 76.25, p < 0.001, but similar to Grade 4, there was no apparent observed mechanism for the data missing due to the variables. To appropriately treat the missing data, maximum likelihood estimation was used in the generalized linear mixed models (Enders, 2010).
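The study does not state which implementation of Little’s MCAR test was used; as one possibility, the naniar package in R provides such a test. The sketch below carries over the hypothetical data frame and variable names from the earlier sketches.

    library(naniar)  # one available R implementation of Little's (1988) test

    # Little's MCAR test over the analysis variables; a significant result,
    # as reported above, is evidence against missing completely at random.
    mcar_test(dat[, c("vocab", "word_rec", "syntax",
                      "read_comp", "writing_risk")])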

Means and standard deviations for the sample are reported in Table 2, which shows that 71% and 67% of the sample received a passing score on the FCAT 2.0 Writing in Grades 4 and 8, respectively. The percentage of students passing the FCAT 2.0 Writing in this sample is higher than the State pass rate (57% for Grade 4 and 54% for Grade 8). Grade 4 students’ Reading Comprehension, Vocabulary Knowledge, and Syntactic Knowledge scores approximated the means of the normative sample (i.e., within 0.10 SD), with stronger Word Recognition skills compared to the normative distribution (i.e., a 0.18 SD difference). The Grade 8 students had approximately similar Word Recognition and Vocabulary Knowledge skills compared to the normative sample (i.e., within a 0.10 SD difference), but lower Reading Comprehension skills (0.19 SD) and lower Syntactic Knowledge skills (0.26 SD). Correlations in Table 3 show moderate, positive associations among the four reading tasks (range = 0.32–0.55 in Grade 4, 0.43–0.64 in Grade 8) and moderate, negative associations at each grade level between FCAT 2.0 Writing risk and the four reading tasks (range = − 0.38 to − 0.22 in Grade 4, − 0.39 to − 0.28 in Grade 8).

Table 2 Descriptive statistics for Grades 4 and 8
Table 3 Correlation matrix for Grade 4 (lower diagonal) and Grade 8 (upper diagonal)

Unique and common variance

Grade 4

Generalized linear mixed model results for Grade 4 are reported in Table S1. The variance in writing risk due to schools for students in Grade 4 was 7.2%, with the other ~ 93% due to student differences. The mean log-odds of risk was − 0.94 (p < 0.001), indicating that students had a 0.28 probability of being at risk on the FCAT 2.0 Writing, a strong correspondence to the observed 29% base rate of writing risk. The model adding AL (Model 3) showed a mean log-odds of − 1.02, indicating that for students with average Vocabulary Knowledge and Syntactic Knowledge, there was a 0.26 probability of writing risk. For students with a one standard deviation increase in either AL measure, the log-odds decreased to approximately − 1.40, corresponding to a 0.19 probability of writing risk. Conversely, scores lower by one standard deviation on either AL measure increased the probability of writing risk to approximately 0.35. Model 1 shows that as Reading Comprehension changes by a standard deviation, the log-odds of writing risk changes by 0.99 units. That is, for a 1 SD increase in Reading Comprehension, the log-odds change from − 1.12 (predicted probability of risk = 0.25) to − 2.11 (predicted probability of risk = 0.11); conversely, for a 1 SD decrease in Reading Comprehension, the log-odds change from − 1.12 to − 0.13 (predicted probability = 0.47). The Word Recognition model (Model 2) demonstrates that a 1 SD increase in Word Recognition skills changed the log-odds from − 1.05 (predicted probability of risk = 0.26) to − 1.70 (predicted probability of risk = 0.15), and a 1 SD decrease changed the log-odds to − 0.40 (predicted probability = 0.40).
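The probabilities reported above follow from the log-odds via the inverse-logit function, p = 1/(1 + e^(−log-odds)), available in R as plogis. A quick check of several of the Grade 4 values:

    # Inverse-logit conversions of the Grade 4 log-odds reported above;
    # small discrepancies reflect rounding of the reported coefficients.
    plogis(-0.94)  # ~0.28, probability of risk in the null model
    plogis(-1.02)  # ~0.26, average AL skills (Model 3)
    plogis(-1.40)  # ~0.20, +1 SD on either AL measure (reported as 0.19)
    plogis(-2.11)  # ~0.11, +1 SD in Reading Comprehension (Model 1)
    plogis(-0.13)  # ~0.47, -1 SD in Reading Comprehension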

Unique and common variances for Grade 4 (Table 4) indicate that, for the model non-inclusive of Reading Comprehension as a predictor (i.e., No-RC), the total student variance explained of 61.2% was mostly due to the common aspects of AL and Word Recognition (41.1%), followed by the unique contribution of AL (14.6%) and the unique effect of Word Recognition (5.5%). The models inclusive of Reading Comprehension (i.e., RC) show a total writing risk variance explained of 65.2%, with common variance of 46.4%, the unique effect of AL at 11.1%, and the unique effects of Reading Comprehension and Word Recognition each at around 4%.

Table 4 Decomposition of total variance explained in writing risk for Grades 4 and 8

Grade 8

Results for the Grade 8 models are in Table S2. The variance in writing risk due to schools for students in Grade 8 was 4.2%, lower than that observed in Grade 4, with the other ~ 96% due to student differences. The mean log-odds of risk in the null model was − 0.79 (p < 0.001), indicating that students had a 0.31 probability of being at risk on the writing outcome, close to the 33% observed base rate. When examining AL, Syntactic Knowledge carried greater weight in its prediction of writing risk (− 0.73) than Vocabulary Knowledge (− 0.47). Converted to predicted probabilities, these results show that the range of probabilities based on 1 SD shifts in Vocabulary Knowledge (i.e., 0.23–0.43) was comparable to that of Syntactic Knowledge (i.e., 0.19–0.50). The inclusion of Reading Comprehension in the model (Model 1) shows that as Reading Comprehension changes by one standard deviation, the log-odds changes by 1.04 units. That is, for a 1 SD increase in Reading Comprehension, the log-odds change from − 0.91 (predicted probability = 0.29) to − 1.95 (predicted probability = 0.12); conversely, for a 1 SD decrease in Reading Comprehension, the log-odds change from − 0.91 to 0.13 (predicted probability = 0.53). The Word Recognition model (Model 2) demonstrates that 1 SD changes in Word Recognition produced predicted probabilities ranging from 0.20 to 0.49.

The unique and common variances for Grade 8 (Table 4) indicate that, for the model non-inclusive of Reading Comprehension (i.e., no-RC), the total student variance explained of 85.3% was mostly due to the common aspects of AL and Word Recognition (74.8%), followed by the unique contribution of Word Recognition (8.1%) and the unique effect of AL (2.4%). The models inclusive of Reading Comprehension show total writing risk variance explained at 85.9%, with common variance of 77.8%, unique effects of AL and Reading Comprehension each below 1%, and a unique effect of 6.8% for Word Recognition.

Discussion

Our results elucidate the role of AL in written composition in two new ways. First, we found that measures with an AL dimension play a significant role in distinguishing between students who are proficient in written composition and those who are at-risk for failing written composition outcomes. Second, we suggest that AL is an integral dimension of the shared knowledge between reading and writing. The results are consistent for both late elementary and middle school students. Altogether, these findings underscore the shared knowledge view of reading and writing which also has implications for effective instruction.

Academic language has a significant role in written composition

Elementary and middle school students with higher AL skills are at lower risk of failing state written composition outcomes, demonstrating that knowledge of AL may serve as a protective factor for performance on written composition tests. Alternatively, lower AL skills may serve as a risk factor for performance in written composition. Our results demonstrate that AL plays an integral role in describing the kind of language knowledge necessary for written composition, complementing previous research demonstrating that knowledge of oral language (more general word-level and sentence-level comprehension) is necessary for written composition (Silverman et al., 2015). Our results aid in specifying the types of word-level and sentence-level knowledge that are relevant for academic performance.

Typically, vocabulary is measured as the number of words a student knows. This type of measurement can be problematic when used to directly inform instruction because the natural instructional implication is that the definitions of more words need to be taught. Research indicates that teaching vocabulary definitions is inefficient for generalizing to reading comprehension (Wright & Cervetti, 2017). The academic and morphological nature of the current assessment highlights an aspect of vocabulary instruction that may be more generative. A more transparent implication of an assessment indicating difficulties with morphology is to teach the finite set of affixes and common root morphemes that are most prevalent in academic text and how different combinations of affixes and roots change the meaning of words (Foorman et al., 2012; NGA & CCSSO, 2010). In fact, instruction in morphological decomposition is one of the vocabulary strategies that has been effective for improving middle school students’ vocabulary and comprehension of academic text at both the word and text levels (Goodwin, 2015; Lesaux, Kieffer, Faller, & Kelley, 2010; Lesaux, Kieffer, Kelley, & Harris, 2014; McKeown, Crosson, Moore, & Beck, 2018). Academic Language Instruction for All Students (Lesaux et al., 2010, 2014), Word Detectives (Goodwin, 2015), and Robust Academic Vocabulary Encounters (McKeown et al., 2018) are all interventions targeting morphological decomposition, and all have had significant impacts within experimental studies. It is important to note, however, that the use of multiple, flexible vocabulary strategies is needed to promote generalization (Wright & Cervetti, 2017) and that several other types of morphology and vocabulary measures are likely to implicate other generative vocabulary learning strategies (Kieffer et al., 2016).

We suggest similar implications for sentence-level academic language comprehension. Syntax is often measured or represented as knowledge of grammatically correct sentence structure. One natural instructional implication may be to spend time teaching grammar or diagramming sentences. However, a meta-analysis of grammar instruction finds that teaching grammar is ineffective and inefficient (Graham, McKeown, Kiuhara, & Harris, 2012). The assessment in the current study evaluates students’ knowledge of the use of connective discourse markers (e.g., however, for instance, similarly), which indicate the relation between ideas in a sentence within an academic text. Although this does not perfectly represent the construct of syntax in a linguistic sense, it may be helpful for linking to research-based instruction. A transparent instructional implication of the current assessment may be to have students practice using connective words to expand or combine sentences. Expanding and combining sentences are currently recommended as best practices for teaching writing and improve sentence-level outcomes (Graham et al., 2012). However, improvements in higher-level outcomes (i.e., written composition) have not been demonstrated.

The integration of morphological instruction with sentence-level instruction makes intuitive sense because many suffixes indicate the part of speech a word serves within a particular sentence. Morphological instruction has demonstrated an impact on a sentence-level morphological task (Lesaux et al., 2014). However, there is not a preponderance of evidence to suggest that interventions at other levels of language have an impact on sentence-level outcomes. This lack of evidence is probably because sentence-level outcomes are not often included in intervention reviews (Baker et al., 2014; Wright & Cervetti, 2017) and because research about the nature of sentence-level linguistic development lags behind research at the other levels of language (Ahmed et al., 2014). Future research that includes these measures is needed to explore the role of sentence-level language. Our results suggest that sentence-level academic language comprehension is one necessary but not sufficient strand of skilled written composition. Our next discussion suggests that it is necessary to consider the interdependent nature of sentence-level academic language comprehension with word-level and discourse-level skills.

The role of academic language is integrated with other skills

When the skills in this study are considered separately, reading comprehension is the most influential skill for written composition, followed by word recognition, followed by AL. Clearly, the higher-level skill of reading comprehension is most likely to predict the higher-level skill of written composition, aligning with previous research showing that connections between skills are stronger at the same level of language (Abbott et al., 2010). In this study, the Word Recognition measure closely resembles a spelling task, and spelling is known to predict written composition outcomes for younger students and students with disabilities (e.g., Berninger et al., 2006). Although the relatively higher amount of unique variance explained by Word Recognition in Grade 8 seems surprising at first, we hypothesize that it may be due to some poor spellers in Grade 8. Human raters are more likely to assign a lower score to a paper of the same overall quality but with more spelling errors (Graham, Harris, & Hebert, 2011). If students have lower spelling skills, they are likely demonstrating poorer spelling in their written composition. It is likely that poor spelling stands out against Grade 8 expectations and thus acts as a discriminating function for passing or failing the Grade 8 written composition test. We also hypothesize that the raters hired to score written composition may not discriminate between colloquial language and AL when determining which students pass or fail the written composition test (Berman & Nir-Sagiv, 2009). Additionally, because this study is cross-sectional, it is not possible to determine whether the different estimates between the two grade levels are due to developmental differences or to the Grade 8 sample representing a higher-risk population than the Grade 4 sample.

The most striking finding is that performance on the four reading skills together explains more than half of the variance in written composition pass rates in Grade 4 (65%) and most of the variance in Grade 8 (86%). Most of the variance explained in written composition pass rates is due to the shared characteristics of AL, reading comprehension, and word recognition, with very little unique variance due to any individual skill, suggesting that these skills share something in common that predicts proficiency in written composition. We hypothesize that AL may be the critical component driving the large amount of shared variance because the measures of AL bridge the word-level and text-level skills that predict written composition. Large amounts of shared variance between these components are consistent with previous research in reading for Grades 4 through 8 (Foorman et al., 2018; Garcia & Cain, 2014; Kieffer et al., 2016; Nagy et al., 2006) and have been suggested to hold for writing as well (Silverman et al., 2015). Given that the AL measures in our study act as word- and sentence-level bridges between the word- and text-level measures, it is likely that this intermediate level of text needs more attention than it is currently given in instruction.

Implications for education

For practical purposes, our findings of large amounts of shared variance between skills are consistent with Scarborough’s illustration of literacy skills as separate strands while students learn them, twisting together into a rope as students learn more. Knowing the components of educationally relevant outcomes and the interrelation of those components is important for guiding the focus of instruction in schools, so that the “rope” of literacy skills can be strengthened. If vocabulary/morphology, syntax, reading comprehension, and word recognition predict a large amount of variance in written composition pass rates, then instruction integrating those skills should influence written composition outcomes. Intervention studies show that instruction on individual components influences student performance, but a large enough impact on written composition outcomes may not be detected. This is true for each of the skill components measured in our study at each level of language. At the sub-word level, morphology instruction is beneficial for spelling skills but has not been examined for broader writing outcomes (Bowers, Kirby, & Deacon, 2010). Spelling instruction is an evidence-based practice for improving writing outcomes throughout elementary school, and instruction on spelling in middle school has impacts on spelling but may need to be integrated with text-level instruction to have impacts on broader written composition outcomes (Berninger et al., 2006; Graham & Harris, 2017; Graham & Santangelo, 2014). At the word and sentence levels, vocabulary instruction is beneficial for reading outcomes (Wright & Cervetti, 2017) but not writing outcomes (Silverman et al., 2015), and syntax instruction is beneficial for sentence-level outcomes but not broader writing outcomes (Datchuk & Kubina, 2012). At the text level, instruction in reading comprehension improves reading outcomes (Scammacca et al., 2015) and has the potential to improve written composition outcomes (Graham & Harris, 2017). Our study suggests that one potential reason for the limited influence of single-skill instruction on broader outcomes is the relatively low amount of unique variance predicted by each skill. The shared variance in our study suggests that instruction in these areas may need to be integrated in order to boost impacts on the language skills required for written composition. For example, a set of intervention studies showed that morphophonemic instruction linking a root word’s origin (morphology) to its spelling improved content-area (academic) reading and spelling more than phonics instruction alone in Grades 3 through 5 (Henry, 1988, 1989, 1993). This particular instruction highlights the need to include morphophonemic word analysis as students are expected to read and write more academic text. Furthermore, the integration of effective sentence-level instruction (e.g., Datchuk & Kubina, 2012) with discourse structure instruction (e.g., Graham et al., 2012a, b) may be a fruitful area for future research for which others (e.g., Beers & Nagy, 2011) have been advocating.

Limitations and future directions

Conclusions in this study may be limited by the interpretation of what the assessments measure, for both the FRA and the FCAT 2.0 Writing. As demonstrated by the results of this study, the FRA measures significantly overlap with each other. We believe the Vocabulary Knowledge and Syntactic Knowledge measures in this study reflect the current state of the field because they are conceptually related to common definitions of AL and to another measure of AL (CALS-I; Uccelli et al., 2014). In their measure of AL, Uccelli et al. (2014) find that AL features similar to the ones incorporated in this study (morphological decomposition, complex syntax, connectives, and anaphoric resolution) fall on a unidimensional factor of AL. The lack of clear boundaries to the construct makes it difficult to measure and to target for instruction. Specifically, the unidimensional nature of measures of AL suggests that more research is needed on the connection between measurement of the components of AL and student outcomes. This type of research would indicate whether the current measure’s construct coverage is too broad or too narrow to detect meaningful changes due to instruction. Future research may draw upon the extensive groundwork that describes the level of academic language in text (e.g., Biber & Conrad, 2009; Fang, Schleppegrell, & Moore, 2014) and the level of academic language produced by students of varying ages (e.g., Berman & Nir-Sagiv, 2009). Further study of AL in practical and theoretical contexts is needed to explore the continuum of the AL register and to identify distinctions that may be important for instruction.

When measuring written composition outcomes, writing prompts and scoring methods vary widely depending on the focus of the assessment (e.g., Scott, 2009). Therefore, the results of our study may or may not generalize to written composition outcomes that are defined differently; that is, the results may not be similar for other states’ tests of written composition. Our results are similar to those of another study in which morphology, vocabulary, and syntax predicted written composition as measured by the Test of Written Language, a common, nationally normed assessment of written composition (Silverman et al., 2015). This suggests that our results may generalize outside of the State of Florida, but generalization depends on the expectations and goals for scoring writing in other contexts (Berman & Nir-Sagiv, 2009). The external validity of this study is further limited to groups performing slightly above average in written composition and to groups who have similar written composition expectations scored in similar ways (i.e., a handwritten informational or persuasive essay scored on a holistic scale). Results may differ if the prompt includes a passage, targets a different genre, requires a typed response, or is scored using an analytic scale.

Conclusions

Good readers and writers flexibly interact with text at multiple levels to build meaningful associations within and across words, sentences, and texts in academic content areas like social studies, science, and math (Englert et al., 2009; Silliman & Scott, 2009). The current study took a closer look at the AL aspect of what these good writers do, building consensus that AL is a dimension of language that deserves attention for its role in written composition proficiency. Although we need to continue to clarify the actionable aspects of AL skills for improved instruction in schools, it is clear that the integrated nature of vocabulary, syntax, reading comprehension, and word recognition skills cannot be ignored.