Introduction

The ability to spell words correctly is an important part of successful written communication. Allocating a great amount of mental resources to the spelling of single words means that fewer resources will be available for higher-level aspects of writing (Treiman & Kessler, 2013). Indeed, poor spellers have been found to write fewer words and produce lower quality compositions than good spellers (Abbott, Berninger, & Fayol, 2010; Moats, Foorman, & Taylor, 2006). Hence, it seems important to explore what it takes to become a proficient speller. A common way to do this is to search for longitudinal predictors of spelling ability in order to find out what distinguishes students who go on to become proficient spellers from those who do not. The present study investigated whether findings from this line of research could be replicated for the opaque Danish orthography. It also sought to extend previous findings by including predictors that have received little attention in the past, and by examining the power of predictors beyond the early phases of spelling development.

A widespread account of spelling development is that the focus of early development (typically in Grades 1 and 2) is the acquisition of phonological spelling knowledge, while the focus of later spelling development (typically beyond Grade 2) is the acquisition of orthographic and morphological spelling knowledge. This developmental pattern is the backbone of several models of stages or phases in the acquisition of spelling skills (e.g., Ehri, 1987, 2014; Frith, 1985; Gentry, 1982). For instance, Ehri’s (1987, 2014) theory of early literacy development distinguishes four overlapping developmental phases (prealphabetic, partial alphabetic, full alphabetic and consolidated alphabetic), each characterized by the predominant type of connections linking spellings of words to their pronunciations in memory. According to this theory, children in the early phases of spelling development (from prealphabetic to full alphabetic) learn how to represent the sound structure of words in a plausible but not necessarily conventional way (i.e., they acquire phonological spelling knowledge), while children in the later phases (from full to consolidated alphabetic) exhibit a growing memory for correct spellings, relying more and more on their knowledge of recurring orthographic patterns in the form of rime spellings, spellings of syllables and spellings of individual words and morphemes (i.e., they acquire orthographic spelling knowledge).

Stage theories have been criticized, however, for oversimplifying the developmental patterns somewhat (e.g., Bourassa & Treiman, 2001, 2014; Walker & Hauerwas, 2006). It is argued that, although there is support for the general idea of a developmental shift from reliance on phonological knowledge to reliance on morphological and orthographic knowledge, even beginning spellers draw to some degree on orthographic and morphological knowledge when spelling words. Moreover, it is pointed out that within each domain (phonological vs. orthographic and morphological knowledge) children progress from simple to increasingly complex spelling patterns. Thus, apparently no sharp distinction between early and later spelling development can be made.

One may also wonder whether a sharp distinction can be made between phonological and orthographic spelling knowledge. Learning to use specific letters or letter sequences for specific words (word-specific orthographic knowledge) may be something completely different from learning to spell words phoneme by phoneme (phonological spelling knowledge). But, on the other hand, orthographic spelling rules may be of essentially the same kind as the more general phoneme-grapheme correspondences on which phonological spelling is based, only with a more restricted scope (e.g., applicable only to a specific set of words). In other words, orthographic spelling may be a mere extension of phonological spelling, building on the same cognitive foundations.

There is, however, some evidence indicating that phonological and orthographic spelling knowledge can be viewed as at least partially distinct constructs. For instance, Hagiliassis, Pratt, and Johnston (2006) administered a battery of phonological and orthographic knowledge tasks to children in Grades 3, 4, and 5. Factor analyses showed that the orthographic tasks loaded onto one factor, while the phonological tasks loaded onto a second factor. Cunningham, Perry, and Stanovich (2001) found similar results with a sample of children in Grade 1. Moreover, a few studies have shown that measures of orthographic knowledge contribute unique variance to word spelling skills above and beyond phonological knowledge (Arab-Moghaddam & Sénéchal, 2001; Conrad, Harris, & Williams, 2013; Rothe, Schulte-Körne, & Ise, 2014). For instance, Conrad et al. (2013) investigated the concurrent prediction of spelling among 7- to 9-year-old English-speaking children. A composite measure of orthographic knowledge (graphotactic knowledge and word-specific orthographic knowledge) explained a significant amount of unique variance in children’s word spelling skills (29 %) after controlling for age and phonological skills.

If the acquisition of orthographic spelling knowledge is based on skills different from those necessary for phonological spelling, then one would expect the power of longitudinal predictors to change over time as students become more proficient spellers and rely relatively more on orthographic knowledge. Presumably, such shifts in predictive patterns will be most easily observable in opaque orthographies, such as English or Danish, where multiple instances of inconsistent mappings between sounds and letters make the acquisition of orthographic spelling knowledge more important than in transparent orthographies, and where rates of development are likely to be slowed down (Caravolas, 2004). However, few longitudinal studies have examined predictors of spelling skills beyond Grade 2, and it is generally not very clear whether predictors of early spelling development play a similar role for later spelling development after accounting for the powerful autoregressive effects of early spelling skills. A major purpose of the present study is to help fill this gap by examining the predictive power of preschool predictors (taken at the end of Danish kindergarten) for early versus later spelling development (Grade 2 vs. Grade 5).

In the following section we review the theoretical basis and the empirical evidence for a range of predictors of spelling skills which have been examined in previous longitudinal studies: phonological awareness, letter knowledge, verbal/phonological short term memory, rapid automatized naming, and paired associate learning. All of these were also included in the present study.

Predictors of spelling development

Phonological awareness (PA) and letter knowledge (LK) are generally held to be essential for understanding the alphabetic principle, i.e., for learning how phonemes and graphemes can be connected (cf. Juul, Poulsen, & Elbro, 2014). For spelling development, this is supported by several longitudinal studies that have found both PA and LK to be strong predictors of spelling skills in the early grades across orthographies (e.g., Caravolas, Hulme, & Snowling, 2001, 2012 [English]; Leppänen, Niemi, Aunola, & Nurmi, 2006 [Finnish]; Lervåg & Hulme, 2010 [Norwegian]; Furnes & Samuelsson, 2009 [English and Norwegian/Swedish]). Hence, PA and LK seem indispensable as predictors in longitudinal studies of early spelling development in any alphabetic orthography. In a study conducted in the relatively transparent German orthography, Landerl and Wimmer (2008) found that PA measured at the beginning of Grade 1 predicted later spelling skills in Grades 4 and 8, after controlling for nonverbal IQ, LK, and RAN-objects in Grade 1. Hence, it was of particular interest to observe whether PA would emerge as a long-term predictor of spelling in the current study of the more opaque Danish orthography.

Especially for learning to spell phonologically, one might expect measures of verbal/phonological short term memory (VSTM/PSTM) to play a significant role, because, in the absence of fully specified orthographic representations of word spellings, children have to remember and analyze the sound structure of words and syllables. Lervåg and Hulme (2010) found that VSTM measured with four different memory-span tests (colors, objects, digits, letters) 10 months before the start of formal reading instruction (mean age 6;4 years) uniquely accounted for Norwegian children’s spelling skills 14 months later, after controlling for PA, LK and RAN. However, Caravolas and Snowling (2001) did not find VSTM (repeating lists of familiar monosyllabic words) measured four months after school entry (mean age 5;1 years) to be predictive of spelling skills 6 and 12 months later among English-speaking children when controlling for PA and LK. Several factors such as task requirements, ages of school entry and assessments, and type of orthographies might lie behind these mixed results. In the present study it was of particular interest to observe whether PSTM would turn out to be a significant predictor in Grade 2, where children were expected to rely mainly on phonological spelling knowledge, and whether PSTM would lose power in Grade 5, where children were expected to rely more on orthographic spelling.

Rapid automatized naming (RAN), referring to the speed with which children can name objects, colors, digits, or letters, has been included in several longitudinal studies of spelling skills across orthographies. In the following studies RAN was measured before the beginning of formal reading and spelling instruction. Caravolas et al. (2012) found that non-alphanumeric RAN (a composite score of RAN-objects and RAN-colors) was a significant predictor of spelling 10 months later across four languages varying in orthographic transparency (English, Spanish, Slovak, and Czech), after controlling for initial spelling ability, PA, LK and VSTM. Mean ages in the four groups ranged from 5;0 to 6;0 years when RAN was assessed. Georgiou, Torppa, Manolitsis, Lyytinen, and Parrila (2012) found that beyond the effects of LK, non-alphanumeric RAN (colors) predicted unique variance in spelling in Grade 2 among English and Greek children but not among children learning to spell in the highly transparent Finnish orthography. Mean ages in the three groups were around 5;6 years when RAN was assessed. Lervåg and Hulme (2010) found that non-alphanumeric RAN (a composite score of RAN-objects and RAN-colors) was a unique predictor of spelling 14 months later in Norwegian children above PA, LK and VSTM. The mean age was 6;4 years when RAN was assessed. Furnes and Samuelsson (2011) found that a latent construct of alphanumeric RAN (RAN-letters and RAN-digits) was a significant predictor of spelling in Grade 1 across languages (Norwegian/Swedish and English) after controlling for the autoregressive effect of Kindergarten literacy skills, vocabulary and PA. Mean ages in the two groups ranged from 6;2 to 6;9 years when RAN was assessed. Thus, across studies and across orthographies varying in orthographic depth, RAN has proven to be a significant predictor of the development of spelling skills in the very early phases of literacy development.

According to some researchers, RAN taps into the ability to form orthographic representations (e.g., Conrad & Levy, 2007; Manis, Seidenberg, & Doi, 1999; Wolf & Bowers, 1999). When letter identification is slow (as reflected by poor RAN performance), orthographic representations of words or word parts cannot be stored efficiently. If these ideas are correct, one would expect RAN to be a better predictor of spelling in opaque (vs. transparent) orthographies, contrary to the findings of some of the studies summarized above. Another challenge comes from studies showing that RAN accounts for similar amounts of variance in word and nonword reading fluency although, presumably, the formation of orthographic representations is more important to word reading than to nonword reading. In a study by Moll, Fussenegger, Willburger, and Landerl (2009), RAN (digits and objects) only accounted for a modest amount of variance in word reading fluency (between 0.5 and 1.7 %) among German-speaking children in Grades 3 and 4 after differences in nonword reading fluency were controlled. A similar result was found by de Jong (2011) with a sample of Dutch-speaking children in Grades 1, 2 and 4. RAN (digits and letters) accounted for similar amounts of variance in standard tests of word and nonword reading fluency. A further challenge comes from studies showing that RAN is more closely related to reading than to spelling, although the formation of orthographic representations is likely as important for spelling as for reading. In a sample of Dutch-speaking children from Grades 1 to 6, Vaessen and Blomert (2013) found that RAN (digits and letters) did not contribute to concurrent spelling performance in any of the grades. This contrasted with the strong contribution of RAN to performance in reading fluency. Finally, Moll et al. (2009) found that PA explained more variance in spelling than RAN, even though most spelling errors in German reflect a lack of orthographic rather than phonological knowledge.

As an alternative account of the relation between RAN and literacy, Moll et al. (2009) suggest that it has to do with the automaticity of orthography to phonology associations at the letter and letter cluster level. This account seems more compatible with the studies summarized above, but it is not quite clear whether one should expect RAN to be a predictor of spelling development beyond the early phases where basic associations between sounds and letters are being formed. Thus, it was of special interest to observe whether RAN would emerge as a strong long-term predictor in the present study.

Since learning to spell is about learning associations between written and spoken language elements, one might expect measures of paired associate learning (PAL) to be predictive of spelling development. PAL involves establishing associations between stimulus items and response items in memory. These can be unimodal (e.g., visual–visual, verbal–verbal) or crossmodal (e.g., visual–verbal) in nature. Importantly, performance on a PAL task depends on successful learning of three separate components: the stimulus item, the response item, and the association between the two. Individual differences in performance may originate from processes operating at any of these three levels (Litt, de Jong, van Bergen, & Nation, 2013).

For reading ability, visual-verbal PAL with nonwords has been shown to be a unique concurrent predictor among English-speaking children (e.g., Hulme, Goetz, Gooch, Adams, & Snowling, 2007; Warmington & Hulme, 2012). Moreover, Vellutino et al. (1996) showed that early variations in PAL ability (matching ideographs with common words) during kindergarten were predictive of variations in later reading skills among English-speaking children in Grades 1 and 2. However, a limitation of this study was that phoneme awareness was not controlled for (Hulme et al., 2007). To our knowledge, only one longitudinal study of spelling development has included a measure of visual-verbal PAL, namely Lervåg and Hulme’s (2010) study of Norwegian children. Participants in this study had to associate three nonword names with pictures of either unfamiliar children, fantasy animals, or letters. These nonword PAL tasks did not predict later word spelling when LK, PA, VSTM, and RAN were controlled for. Norwegian is closely related to Danish (the two languages are mutually intelligible), but, unlike Danish, Norwegian has a relatively transparent orthography (Hagtvet, Helland, & Lyster, 2006). Hence, it was of special interest to observe whether the weak predictive pattern found by Lervåg and Hulme (2010) would be replicated in the current study of the more opaque Danish orthography.

One theoretical account of the PAL–reading relationship is that visual–verbal PAL taps a crossmodal associative learning mechanism involved in establishing orthography–phonology mappings (Hulme et al., 2007; Warmington & Hulme, 2012). The orthographic units involved in this mapping process may involve either lexical units (arrays of letters that identify words) or sublexical units (letters or letter clusters). In this view, visual-verbal PAL taps the efficiency with which novel associations can be created in memory between visual stimuli and their names (the verbal response). According to many theorists, reading and spelling are closely linked during development and rely on the same store of knowledge (Ehri, 2000; Perfetti, 1997), and studies of the interplay between reading and spelling indicate that orthographic representations of words acquired during exposure to print are used for both reading and spelling (Moll & Landerl, 2009). Hence, if PAL is tapping variations in establishing orthography–phonology mappings at both the lexical and sublexical level, one would expect PAL to be predictive of both spelling and reading skills from the earliest stages of literacy development.

Another theoretical account, termed the verbal account by Litt et al. (2013), is based on studies of children with dyslexia. Findings from these studies suggest that deficits in visual–verbal PAL are explained by the verbal nature of the task rather than by its crossmodal demands (e.g., Mayringer & Wimmer, 2000; Messbauer & de Jong, 2003). More specifically, researchers have proposed that verbal learning is the crucial component of visual–verbal PAL because the strongest deficits are observed when response stimuli are nonwords, i.e., phonological forms that have not been learned prior to the test (Elbro & Jensen, 2005; Mayringer & Wimmer, 2000). This has recently been supported by the study by Litt et al. (2013), who found that only PAL tasks requiring verbal output correlated significantly with reading. It has been suggested that difficulties in learning new phonological forms, tapped by visual-verbal PAL with nonwords, may affect both reading and spelling acquisition via impaired storage of new phonological forms. These phonological forms are thought to serve as underpinnings of the letter patterns of words or parts of words (Mayringer & Wimmer, 2000). Hence, orthographic learning may be negatively affected by under-specified phonological representations, and this may be a particular problem in opaque orthographies, where writers often need to establish word-specific associations between (strings of) phonemes and their conventional spellings (Shahar-Yames & Share, 2008). The present study addressed whether visual-verbal PAL with nonwords would emerge as a long-term predictor of spelling among Danish children, who are faced with multiple instances of inconsistent mappings between sounds and letters.

Spelling in Danish

As previously mentioned, Danish has an opaque orthography with many inconsistent mappings between phonemes and graphemes and with many complex graphemes (Elbro, 2006; Juul & Sigurðsson, 2005). This sets it apart from the orthographies of the other Nordic languages (e.g., Norwegian and Finnish) and makes it similar to English (Seymour, Aro, & Erskine, 2003). Computing phoneme-grapheme consistencies along the same lines as Kessler and Treiman (2001), Juul (2008) reported consistencies (on a scale from 0 to 1) of .672 for Danish vowels and .750 for consonants. These coefficients indicate that the correct spelling of a Danish phoneme is generally quite hard to predict. For English, Kessler and Treiman (2001) found an even lower vowel consistency of .529; they did not report consistencies for individual consonant phonemes.

Two longitudinal studies have found Kindergarten PA and/or LK to be predictive of spelling in Danish beginners (Frost, 2001; Juul, 2007). However, both studies terminated in Grade 2 when variance in spelling mainly reflected phonological rather than orthographic spelling skills. Measures of spelling were also included in two Danish intervention studies (Elbro & Petersen, 2004; Lundberg, Frost, & Petersen, 1988), but, unfortunately, these did not report correlations between Kindergarten measures and later spelling skills. Thus, although Danish is of theoretical interest as an orthography akin to English, relatively little is known about predictors of spelling skills in Danish students.

The present study

In the present study we asked whether the findings from other orthographies with respect to longitudinal predictors of early spelling development could be replicated for Danish. Specifically, we asked whether PA, LK, RAN, and PSTM measured in Grade 0 (=Kindergarten) would predict spelling skills at the beginning of Grade 2. Furthermore, we asked two questions which few previous studies have addressed, namely whether the predictors would also predict later phases of spelling development (from Grade 2 to Grade 5), and whether the addition of Kindergarten measures of PAL (both words and nonwords) would enable us to predict additional variance in spelling skills.

With respect to early spelling development (Grade 2 spelling), findings from other orthographies suggested that PA and LK would be the most salient predictors. For PAL (included mainly on the basis of studies of reading development), we expected that both PAL-words and PAL-nonwords would be positively correlated with early spelling skills. However, whether PAL would be a unique predictor of spelling above and beyond PA, LK, RAN, and PSTM was an open question.

With respect to later spelling development (Grade 5 spelling) we expected that, given the greater time span, all predictors would tend to lose power. However, we suspected that this tendency would be less strong for measures associated with the acquisition of orthographic spelling knowledge than for measures associated with phonological spelling knowledge. If measures were predictive specifically of the acquisition of orthographic spelling knowledge, we expected that they would remain significant even with controls for Grade 2 spelling levels.

Method

Participants and design

The present study was part of a longitudinal study conducted in Copenhagen, Denmark (cf. Elbro, de Jong, Houter, & Nielsen, 2012). The main focus of the study was the development of word reading accuracy and speed in Grades 1 and 2, but measures of spelling (the focus of this report) were included at the beginning of Grade 2 and in a follow-up at the beginning of Grade 5. Predictor measures were taken at the end of Grade 0 (the Kindergarten grade). Results from the study have previously been reported in three articles (Elbro et al., 2012; Poulsen, Juul, & Elbro, 2012; Juul et al., 2014), none of which shared the present focus on development of spelling skills.

In this article we report results from 140 students who participated at all three assessment points (Grade 0, Grade 2 and Grade 5). The students came from eight classes from four schools in Copenhagen. Nine students (6 %) were bilingual, but all but one listed Danish as their preferred language. Sixty-seven (48 %) were girls. Mean ages were 6;10 years (SD = 4 months) at the end of kindergarten; 8;3 years at the beginning of Grade 2; and 11;3 years at the beginning of Grade 5.

The original sample was somewhat larger (187 students in Grade 0, and 174 in Grade 2), but not all participants could be re-tested in Grade 5, either because of moving, absence on the day of testing, or because no signed consent from the parents was handed in. On the spelling test administered at the beginning of Grade 2, no significant difference was found between the 140 students who remained in the study, and the 34 students who only participated in Grade 2.

Spelling skills were not assessed in Grade 0, but results on a test of word reading accuracy indicated that initial literacy skills were quite limited; on a list of 32 items, 73 % of the 140 participants were unable to read more than two words correctly. This came as no surprise, as Danish students do not receive formal reading instruction in Grade 0. However, games and activities designed to stimulate phonological awareness and letter knowledge are common at this grade level.

Scores on the spelling tests in Grades 2 and 5 were found to be close to the reported norms for the tests (Juul, 2012). Thus, the sample appears to be typical for Danish students, at least with respect to spelling levels.

Procedure

All testing was done by trained assistants and took place in a quiet room at the participants’ school or, for the group-administered tests, in the participants’ own classrooms.

Measures

Preschool-measures

Phoneme deletion

In this test (adapted from Elbro, Borstrøm, & Petersen, 1998), the participants were presented with a word spoken by the examiner and asked to say what was left when a given phoneme was deleted, for example, What is left if you remove [m] from mand (‘man’)? Expected answer: and (‘duck’). The phonemes to be deleted were initial (9 words), medial (5 words), or final (4 words). Up to six practice trials were given to each participant. Testing was stopped if the participant made four incorrect responses in a row. The score was the number correct. Cronbach’s alpha was .91 in the full sample.
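The discontinuation rule described above can be illustrated with a minimal sketch. The function name and the representation of responses as a list of booleans are assumptions made here for illustration; the original test materials are not reproduced, and items after the stopping point are simply treated as not administered.

```python
def score_with_stop_rule(responses, stop_after=4):
    """Number correct under a discontinuation rule (hypothetical helper).

    responses: booleans in administration order (True = correct).
    Testing stops once `stop_after` incorrect responses occur in a row;
    later items are treated as not administered.
    """
    correct = 0
    consecutive_errors = 0
    for is_correct in responses:
        if is_correct:
            correct += 1
            consecutive_errors = 0
        else:
            consecutive_errors += 1
            if consecutive_errors == stop_after:
                break  # discontinue testing
    return correct
```

With 18 test items (9 initial, 5 medial, 4 final), a score computed this way would range from 0 to 18.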

Phoneme matching

In this group-administered test (from Borstrøm & Petersen, 2006) participants were asked to select one of four pictures that had the same initial phoneme as a target picture. The test has two parts with 10 items each; one part with vowels and one part with consonants as target phonemes. Two practice items were given for each part of the test. The score was the number correct. Cronbach’s alpha was .80 in the full sample.
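The reliability coefficients reported for the predictor measures are Cronbach's alpha values. The article does not describe how they were computed; the sketch below uses the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), and the function name and data layout are assumptions for illustration only.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from a participants-by-items score matrix.

    item_scores: list of rows, one per participant, each a list of
    item scores (e.g., 0/1 for incorrect/correct). Hypothetical helper.
    """
    k = len(item_scores[0])  # number of items
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

Population variances are used for both the items and the totals; using sample variances throughout would give the same ratio.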

Phonological short term memory

In this test participants were asked to repeat 19 nonsense words consisting of one to five syllables (e.g., skug, ki-bra-di-ka-se). The score was the number correct. Cronbach’s alpha was .72 in the full sample.

Letter naming

In this test (from Elbro et al., 1998), the participants were asked to name each of the 29 uppercase letters in the Danish alphabet presented in a random order on a sheet of paper. The score was the number correct. Cronbach’s alpha was .92 in the full sample.

Rapid automatized naming with digits and objects

Previous studies have used a range of different RAN tasks (cf. the introduction). In the present study both an alphanumeric task (digits) and a non-alphanumeric task (objects) were used. Digits were preferred over letters for the alphanumeric task, both because it was unclear whether all Grade 0 students would have sufficient letter knowledge, and also to avoid letter knowledge as a confound when interpreting correlations between RAN and literacy skills. In the digit section of the test, the participants named five rows of 10 digits (digits 1–5) presented in a fixed random order. In the objects section, the participants named four rows of eight objects (sol ‘sun’; saks ‘scissors’; hjerte ‘heart’; and blomst ‘flower’). The score was the number of correctly named items per second. The correlation between RAN-digits and RAN-objects in the present sample was .67.
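As a small illustration of the rate score, the sketch below divides the number of correctly named items by the naming time. The function name, the constants, and the example timing are hypothetical; only the item counts (50 digits, 32 objects) follow from the description above.

```python
N_DIGIT_ITEMS = 5 * 10   # five rows of 10 digits
N_OBJECT_ITEMS = 4 * 8   # four rows of eight objects


def ran_score(n_correct, seconds):
    """Correctly named items per second (hypothetical helper)."""
    if seconds <= 0:
        raise ValueError("naming time must be positive")
    return n_correct / seconds


# Example with made-up values: 48 of 50 digits named correctly in 60 s.
example = ran_score(48, 60.0)
```

Expressing the score as a rate means that higher values indicate faster naming, so the measure is expected to correlate positively with literacy skills.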

Paired associate learning with nonwords

The participants had to learn unfamiliar names of three unfamiliar cartoon animals (sput, laf and ky). Initially, two of the animals were introduced along with a small story. Their names were repeated numerous times. The participants repeated the names and answered simple questions about the story. The purpose of these questions was to get the participants to repeat the names. The participants were then presented with the two animals in varying orders in separate trials, until the animals were named correctly on three successive trials. If the participants made mistakes they were corrected and asked to repeat the names. When the criterion was reached, a new animal was presented in the same way as the first two. Naming trials with three animals then continued until they were named correctly three times in a row. The task was terminated if the criterion was not reached within 15 trials. If testing terminated because the criterion was reached, the remaining trials were scored as correct. The score for the task was the number of correctly named animals in the 15 trials. The task was modeled after Elbro and Jensen (2005), but differed in some respects; in the task used in the present study, the participants had several opportunities to repeat the names before the first trial, whereas in the study by Elbro and Jensen, participants only repeated each name once before the first trial. Moreover, in the study by Elbro and Jensen, human faces were used rather than animals, and participants were introduced to all names in the first trial. Additionally, compared to the tasks used in the study by Lervåg and Hulme (2010), the task included more separate trials (15 vs. 10); also, the nonwords to be learned in the Lervåg and Hulme study were generally more complex, e.g., CCVCV.
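The scoring rule can be sketched as follows. This is a simplified illustration and not the authors' scoring procedure: it assumes a single phase in which every trial presents the same number of animals, and it does not model how the 15 trials were divided between the two-animal and three-animal phases, which the description leaves open. The function and parameter names are hypothetical.

```python
def pal_score(trial_counts, n_items=3, max_trials=15, criterion=3):
    """Total correctly named items over `max_trials` trials.

    trial_counts: number of items named correctly on each administered
    trial. Once `criterion` successive trials are fully correct, testing
    stops and every remaining trial is credited as fully correct.
    """
    score = 0
    streak = 0  # successive fully correct trials
    for trial_index, n_correct in enumerate(trial_counts[:max_trials], start=1):
        score += n_correct
        streak = streak + 1 if n_correct == n_items else 0
        if streak == criterion:
            remaining = max_trials - trial_index
            return score + remaining * n_items
    return score  # criterion not reached within max_trials
```

Crediting the remaining trials keeps scores comparable between participants who reach the criterion early and those who need all 15 trials.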

Paired associate learning with words

The participants had to learn four real names (Nina, Lone, Jeppe and Lasse). These names are all frequent in Danish; however, it is unlikely that their spellings were known to the participants, who had very limited literacy skills (cf. the participants section above). The procedure was similar to the nonword task except that three cartoon animals were introduced to start with and this time without a story. The task was terminated if the criterion was not reached within 15 trials. If testing terminated because the criterion was reached, the remaining trials were scored as correct. The score for the task was the number of correctly named animals in the 15 trials. The task was modeled after Elbro and Jensen (2005), but differed in some respects, as described above. The correlation between the two PAL-tasks in the present sample was .38 (cf. the comments on Table 1).

Table 1 Descriptive statistics for predictor measures (Grade 0)

Spelling (Grade 2 and Grade 5)

Spelling skills were assessed with age-appropriate standardized group-administered tests of word spelling. Staveprøve 2 (‘Spelling Test 2’, recommended for students from Grade 2 to 4; Juul, 2012) was used in Grade 2, and Staveprøve 3 (‘Spelling Test 3’, recommended for students from Grade 4 to 6; Juul, 2012) was used in Grade 5. Strong correlations between the two tests have been found for fourth-graders who took both tests either in September (r = .84; N = 298) or February (r = .83; N = 528; standardization sample data owned by the publishers). Responses were scored both for correctness and for phonological plausibility.

Staveprøve 2 has 17 items which target phonological spelling skills (e.g., several items feature two-consonant onsets, and some sounds have to be written with a complex grapheme in order to be phonologically plausible, such as [ŋ] = ng). For correct spelling some orthographic knowledge is required, too (e.g., knowledge that the onset [sb] is spelled sp rather than sb; that certain vowel spellings depend on the length of the vowel; and that certain consonants are doubled after short vowels). Cronbach’s alpha in the standardization sample was .91 (Juul, 2012).

Staveprøve 3 has 36 items which target orthographic spelling skills (e.g., many items feature vowel spellings that are not predictable from phonology, silent letters, or suffixes that need to be identified as such in order to be spelled correctly such as the notoriously difficult present tense marker -er; Juul & Elbro, 2004). Cronbach’s alpha in the standardization sample was .94 (Juul, 2012).

Results

Predictor measures

Descriptive statistics for the predictor measures are given in Table 1. The results indicate that phoneme deletion was a challenging task for the participants, while phoneme matching was fairly easy. In the subsequent analyses, these two measures are combined (mean z-scores) into a single measure of phonological awareness (PA). The correlation between the two was only moderate (r = .36), but, presumably, this was due to the different distributions (a floor tendency in the deletion task and a ceiling tendency in the matching task). Likewise, the two RAN measures (r = .67) were combined, in order to simplify analyses and maximize reliability. The remaining measures were entered separately in the subsequent analyses. Note, however, that many participants scored near ceiling on the test of LK and on PAL-words; contributions to the prediction of spelling skills may be underestimated because of the limited sensitivity of these measures.
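The composite scores can be illustrated with a minimal sketch: each measure is standardized, and the two z-scores are averaged for each participant. The function names are hypothetical, and the use of the sample standard deviation is an assumption, since the article does not state which standard deviation was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize raw scores to mean 0 and (sample) SD 1."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def composite(measure_a, measure_b):
    """Mean of two z-scored measures, one value per participant."""
    za = z_scores(measure_a)
    zb = z_scores(measure_b)
    return [(a + b) / 2 for a, b in zip(za, zb)]
```

Standardizing first gives the two measures equal weight in the composite, regardless of differences in their raw score ranges.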

The correlations among the measures are given in Table 2. The correlation between the two PAL measures was only low to moderate (r = .38), but it is comparable in size to the correlations between the PAL measures reported in the study by Lervåg and Hulme (2010). The correlations between RAN and PAL measures were weak and non-significant, suggesting that distinct constructs were tapped. All measures correlated significantly with PA, and, as one might expect, both PAL measures correlated significantly with PSTM.

Table 2 Correlations among predictor measures (Grade 0)

Spelling measures

On the Grade 2 spelling test, the students spelled 4.1 of the 17 items correctly (SD = 3.2) on average. This rather low score is typical of the age group (as mentioned in the participants section above); the test is intended for students all the way up to Grade 4, and therefore features relatively difficult words. As one would expect at this level, students did not always spell the words in a phonologically plausible way either (M = 11.6; SD = 4.9). Hence, low scores can be due to limitations in either phonological or orthographic spelling skills, or both.

On the Grade 5 spelling test, the students spelled 22.3 of the 36 items correctly (SD = 7.9). Here, the participants’ spellings were nearly always phonologically plausible (M = 32.0; SD = 5.4). Hence, individual differences in the Grade 5 spelling test were primarily reflections of differences in orthographic spelling skills.

In the relatively few cases where spellings were not phonologically plausible, the erroneous spelling often reflected a common reduced pronunciation, e.g., leaving out the unstressed middle syllable of the present participle syngende ‘singing’ [ˈsøŋənə > ˈsøŋnə]. At this level, it seems likely that spelling knowledge is an important source of knowledge of distinct pronunciations, rather than vice versa; students may not be aware that the distinct pronunciation of syngende has three syllables, because they are poor spellers. In other words, phonologically implausible spellings may not reflect a lack of phonological spelling ability per se.

Predicting early versus later spelling

The correlation coefficients between the predictor measures and the spelling measures appear in Table 3. All predictors were significantly associated with spelling in both grades and the two spelling measures correlated moderately with each other.

Table 3 Correlations among predictor measures (Grade 0) and spelling measures (Grades 2 and 5)

We ran a series of z-tests of dependent correlations (Meng, Rosenthal, & Rubin, 1992) to test whether the correlation between each predictor measure and spelling changed significantly from Grade 2 to Grade 5. The correlation between PA and spelling weakened significantly from Grade 2 to Grade 5 (Z = 2.19, p < .05), while the correlation between PAL-nonwords and spelling strengthened significantly from Grade 2 to Grade 5 (Z = 2.43, p < .05). For the other measures, the differences between Grade 2 and Grade 5 coefficients were not significant, and the expected weakening tendency was found only for PSTM and RAN. Furthermore, when compared to PAL-words, PAL-nonwords was significantly more strongly correlated with spelling in Grade 5 (Z = 2.17, p < .05). However, the two PAL measures were equally correlated with spelling in Grade 2 (cf. Table 3).
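For readers unfamiliar with the test, the Z statistic of Meng, Rosenthal, and Rubin (1992) for two dependent correlations that share one variable can be sketched as below. The function names are ours, and the example values are hypothetical apart from r = .56 (the Grade 2 to Grade 5 spelling correlation reported in the text); the sample size in particular is an assumption, so the printed result is not a reproduction of the reported Z values.

```python
import math

def meng_z(r1, r2, rx, n):
    """Z for H0: rho1 == rho2, where r1 and r2 are the correlations of one
    common variable with two other variables, and rx is the correlation
    between those two other variables (Meng, Rosenthal, & Rubin, 1992)."""
    z1, z2 = math.atanh(r1), math.atanh(r2)         # Fisher z-transforms
    r2bar = (r1 ** 2 + r2 ** 2) / 2                  # mean squared correlation
    f = min((1 - rx) / (2 * (1 - r2bar)), 1.0)       # f is capped at 1
    h = (1 - f * r2bar) / (1 - r2bar)
    return (z1 - z2) * math.sqrt((n - 3) / (2 * (1 - rx) * h))

def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical illustration: a predictor correlating .55 with Grade 2
# spelling and .40 with Grade 5 spelling; n = 150 is an assumed sample size.
z = meng_z(r1=0.55, r2=0.40, rx=0.56, n=150)
print(round(z, 2), round(two_sided_p(z), 3))
```

Note that the test takes the correlation between the two spelling measures into account: the more strongly the two outcomes correlate, the smaller the difference between coefficients needed to reach significance.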

Next, we conducted two multiple regression analyses to test whether the Grade 0 measures would contribute uniquely to the prediction of early spelling in Grade 2 and later spelling in Grade 5, respectively. In both analyses the six predictors were entered simultaneously as independent variables. For each predictor the squared semipartial correlation was calculated. This correlation expresses the unique contribution of each predictor to the total variance of the dependent variable (Tabachnick & Fidell, 2014, p. 208). These regression analyses allowed investigating the predictive patterns for early and later spelling.
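The notion of a squared semipartial correlation can be made concrete with a short sketch. Here each predictor is first residualized on the remaining predictors, and the residual is then correlated with the outcome; the squared result equals the drop in R² when that predictor is removed from the full model. The function name and the simulated data are hypothetical and serve only as illustration.

```python
import numpy as np

def squared_semipartials(X, y):
    """Return sr^2 for each column of X with respect to y.

    sr_i is the correlation between y and the part of predictor i
    that cannot be predicted from the other predictors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    sr2 = np.empty(k)
    for i in range(k):
        # Regress predictor i on an intercept plus all other predictors.
        others = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        coef, *_ = np.linalg.lstsq(others, X[:, i], rcond=None)
        residual = X[:, i] - others @ coef
        sr2[i] = np.corrcoef(y, residual)[0, 1] ** 2
    return sr2

# Simulated data (not from the study): three correlated predictors,
# of which only the first two are related to the outcome.
rng = np.random.default_rng(0)
shared = rng.normal(size=200)
X = np.column_stack([shared + rng.normal(size=200) for _ in range(3)])
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=200)

print(np.round(squared_semipartials(X, y), 3))
```

Because the predictors share variance, the sr² values typically sum to less than the total R²; the remainder is variance explained jointly.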

Table 4 shows the results of the two multiple regression analyses with early spelling in Grade 2 and later spelling in Grade 5 as the dependent variables. The table displays the standardized regression coefficients (β), the squared semipartial correlations (sr²) and the total amount of variance explained (R²). In total, the six predictors explained 43 % of the variance in early spelling in Grade 2. Only PA, PSTM, and RAN explained unique variance above and beyond the other variables.

Table 4 Summary of multiple regression analyses for variables predicting early spelling in Grade 2 and later spelling in Grade 5

For Grade 5 spelling, the six predictors explained 33 % of the variance. In this model, only RAN and PAL-nonwords explained unique variance beyond the other variables. The two models are evidently distinct, and only RAN made a unique contribution to both.

Finally, we examined whether RAN and PAL-nonwords remained significant predictors of Grade 5 spelling if Grade 2 spelling was taken into account. In other words, we asked whether RAN and PAL-nonwords could be viewed as predictors of developments in spelling that took place between Grades 2 and 5. The correlation between Grade 2 and Grade 5 spelling was fairly strong (r = .56, cf. Table 3), but as can be seen in Fig. 1 (a plot of the spelling scores in Grades 2 and 5, with vertical and horizontal lines representing the means), some students obtained higher scores in Grade 5 than one would expect from their relatively low scores in Grade 2 (the circles appearing in the upper left corner of the scatterplot). Thus, not all variance in Grade 5 spelling was explained by Grade 2 spelling. On the other hand, students who started out with relatively high scores in Grade 2 seem to have continued their course of development, and nearly all obtained above-average scores again in Grade 5.

Fig. 1 Scatterplot of spelling scores in Grade 2 and Grade 5

To shed light on this question, a hierarchical multiple regression analysis with spelling in Grade 5 as the dependent variable was conducted (cf. Table 5). At step one early spelling in Grade 2 was entered to control for the effect of early spelling skills. Then, RAN and PAL-nonwords were entered as predictor variables at the second and third step; the remaining predictor variables were left out because they did not explain unique variance in the previous model. The analysis showed that PAL-nonwords did survive as a unique predictor when Grade 2 spelling was controlled. RAN, however, did not. In total, 42 % of the variance in Grade 5 spelling was explained.
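The logic of entering predictors in steps can be sketched as follows: each step adds one predictor to those already entered, and the increase in R² is the variance that the new predictor explains beyond the earlier ones. This is a minimal sketch with hypothetical function names and simulated data; it does not reproduce the values in Table 5, and the F-test for each increment is omitted.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares regression of y on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def hierarchical_steps(steps, y):
    """steps: list of (name, scores) in order of entry.
    Returns a list of (name, cumulative R^2, change in R^2)."""
    y = np.asarray(y, dtype=float)
    entered, previous, results = [], 0.0, []
    for name, scores in steps:
        entered.append(np.asarray(scores, dtype=float))
        r2 = r_squared(np.column_stack(entered), y)
        results.append((name, r2, r2 - previous))
        previous = r2
    return results

# Simulated scores (not data from the study).
rng = np.random.default_rng(1)
n = 150
grade2 = rng.normal(size=n)
ran = 0.4 * grade2 + rng.normal(size=n)
pal_nonwords = rng.normal(size=n)
grade5 = 0.6 * grade2 + 0.3 * pal_nonwords + rng.normal(size=n)

steps = hierarchical_steps(
    [("Grade 2 spelling", grade2), ("RAN", ran), ("PAL-nonwords", pal_nonwords)],
    grade5,
)
for name, r2, delta in steps:
    print(f"{name}: R2 = {r2:.3f}, change in R2 = {delta:.3f}")
```

Entering Grade 2 spelling first means that any later increment reflects variance in Grade 5 spelling that is not already accounted for by earlier spelling skills.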

Table 5 Hierarchical multiple regression analysis for variables predicting later spelling in Grade 5 controlling for early spelling in Grade 2

Discussion

In the present study we investigated to what extent a range of measures taken at the end of Kindergarten predicted spelling skills in Danish children in an early phase (beginning of Grade 2) and in a later phase (beginning of Grade 5) of development. For the early phase, we found that PA, RAN and PSTM were unique predictors, whereas LK and PAL with words and nonwords were not. For the later phase, the pattern of prediction was clearly different, with RAN and PAL-nonwords being the only significant predictors. When controlling for Grade 2 spelling levels, PAL-nonwords still explained a significant and relatively large share of the variance (R² = .09), suggesting a specific link between this measure and the acquisition of spelling skills beyond Grade 2. Overall, the results suggest that the acquisition of orthographic spelling knowledge (occurring mainly in the later phases of spelling development) is partly based on skills different from those necessary for phonological spelling development (in the earlier phases).

The contributions of PA and RAN to early spelling replicated findings from previous studies of other orthographies (e.g., Caravolas et al. 2001, 2012; Georgiou et al. 2012; Lervåg & Hulme, 2010). The significant contribution from RAN seems to be in accordance with the suggestion put forward by Moll et al. (2009) that RAN is related to the automaticity of orthography to phonology associations at the letter and letter cluster level rather than to the acquisition of orthographic spelling knowledge.

The finding that RAN was not a predictor of later growth in spelling is in accordance with earlier findings (e.g., Furnes & Samuelsson, 2011; Lervåg & Hulme, 2010). Lervåg and Hulme suggested that RAN’s power as a concurrent predictor of spelling skills above Grade 2, as found in some correlational studies (e.g., Moll et al. 2014; Savage, Pillay, & Melidona, 2008), reflects the link between RAN and individual differences in earlier stages of spelling development. This interpretation is also in accordance with the assertion by Moll et al. (2009) that RAN reflects the automaticity of orthography to phonology associations at the letter and letter cluster level. By contrast, if RAN taps into the ability to form word-specific orthographic representations (e.g., Conrad & Levy, 2007; Manis et al., 1999; Wolf & Bowers, 1999), one would expect RAN to be specifically related to the later phase of spelling development. The findings of the current study did not show such a pattern.

The unique contribution of PSTM to the prediction of early spelling may simply reflect the fact that relatively heavy demands are placed on PSTM when children still struggle to analyze the sound structure of words during a dictation task. Still, the finding suggests that measures of short term memory are important as controls in studies of early spelling skills.

The relevance of LK as a spelling predictor was not confirmed in the present study. Note, however, that LK shared substantial variance with PA (r = .47), and that the LK measure lacked sensitivity in the upper range; many participants already knew most of the alphabet when we tested them at the end of Kindergarten.

The inclusion of PAL in the present study did not improve the prediction model for early spelling development although, as expected, both PAL-words and PAL-nonwords correlated significantly with spelling skills. For later spelling, however, PAL-nonwords (but not PAL-words) was a unique predictor. The fact that PAL-nonwords gained predictive power from the early to the later phase of spelling development is, perhaps, the most remarkable finding of our study and seems to contrast with the finding of Lervåg and Hulme (2010) that PAL-nonwords did not predict growth in spelling skills from Grade 2 and onwards to Grades 3 and 4. The contrast may be due to differences in the transparency of the Norwegian and Danish orthographies; the opaque Danish orthography requires orthographic learning to a much higher degree than the more transparent Norwegian orthography, especially in the later phases of spelling development. The contrast may also be due to differences in task demands. The three PAL-nonwords tasks used in the Norwegian study were clearly more difficult than the task used in the present study; across tasks the participants correctly named 40 % of the items in the Norwegian study, which is much lower than the 77 % of items correctly named in the present study. Compared to the participants in the Norwegian study, the participants in the present study had more opportunities to repeat the nonwords before the first trial, i.e., they had better opportunities to establish representations of the new phonological forms before associating them with the visual stimuli. Moreover, they completed more trials (15 vs. 10). Together, these differences suggest that our PAL-nonwords task was more sensitive to differences in verbal learning of new phonological forms among children who performed in the lower range.
This interpretation seems to fit with the theoretical position that verbal learning is the critical factor behind the relationship between PAL-nonwords and literacy skills (e.g., Litt et al. 2013), a position which also accommodates the finding that PAL-nonwords, but not PAL-words, was a unique predictor of later spelling.

To explain why verbal learning abilities (as tapped by PAL-nonwords in Kindergarten) should be specifically related to the development of orthographic spelling skills, we speculate that children who have difficulties learning new phonological forms also have difficulties extracting the phonological forms that correspond to conditional or word-specific spelling patterns. When spelling words akin to the items from the Grade 5 spelling test, children cannot rely on simple phoneme-grapheme correspondences but have to draw on knowledge of recurring orthographic patterns. In order to remember such spelling patterns, children must link them to phonological forms below the word level. Well-specified representations of these forms may play an important supportive role as underpinnings for the crucial letter patterns.

This interpretation, however, rests on the reliability of the PAL-nonwords task used in the current study. As discussed below, the PAL measures may have had limited reliability.

Limitations

Since different tests of spelling skills were used in Grade 2 and Grade 5, we were evidently not predicting variation on the exact same measure at the two time points. The differences in the predictive patterns for the early and later phases of spelling development may reflect differences between the two tests. As the same word material would not be equally sensitive to differences in spelling skills for children in Grade 2 and Grade 5, it may be more accurate to say that the present study investigated the achievement of spelling knowledge children are expected to have acquired at different phases in their spelling development. However, as mentioned in the method section, strong correlations have been found for fourth-graders taking both tests, indicating that the two tests do, to a large extent, measure the same, or strongly related, skills.

Because many students obtained relatively low scores on the Grade 2 spelling test, the predictive patterns observed could reflect the limited sensitivity of this measure in the lower range. For participants with low scores, a measure based on the number of phonologically plausible (rather than correct) spellings was more sensitive. However, when we repeated the regression analyses above with this alternative Grade 2 spelling measure, the predictive patterns found were essentially the same.

A major limitation of our study was that measures of morphological awareness were not included among the predictors despite the relevance of morphological knowledge for spelling (e.g., Boulware-Gooden, Joshi, & Grigorenko, 2015; Bourassa & Treiman, 2014). This was because the study was part of a larger study focusing primarily on the development of accuracy and speed in reading, and, again, because of a concern that an overly demanding test battery in Grade 0 would cause participants to withdraw from the study. It seems likely that such measures would have contributed to the prediction of spelling development, especially in the later phases where awareness of inflectional morphemes has been found to correlate with spelling skills (e.g., Juul, 2005). Also, it is possible that such measures would have shared variance with our PAL measures.

Among the predictor measures included, some had somewhat limited sensitivity. First, there was some degree of ceiling effect on LK, PAL-words and phoneme matching and a degree of floor effect on phoneme deletion. For phoneme awareness, using a combined measure of the phoneme matching and phoneme deletion scores solved this problem. However, for LK and PAL-words, the ceiling tendencies may indeed have reduced their predictive power. Especially in light of the high level of LK observed, it would have been useful to include a test of initial spelling skills in the Grade 0 battery. In fact, such a test was considered, but not included; we were concerned that the test battery could be too time consuming and demanding for children who were not used to being tested, and lead to negative attitudes towards continued participation in the longitudinal study.

A final possible limitation to be considered is the reliability of the PAL tasks. Measures of internal consistency are probably not informative reliability measures for PAL tasks since they really consist of only one item (a fixed set of words to be learned; Poulsen et al. 2012). However, the correlations with other measures suggest that our measures were at least comparable to those used in previous studies. The correlation between PAL and PA was in the same range as found in some studies (Hulme et al. 2007; Litt et al. 2013; Windfuhr & Snowling, 2001), but lower than in other studies (de Jong, Seveke, & van Veen, 2000; Lervåg, Bråten, & Hulme, 2009). Likewise, the correlation between PAL and RAN was in the same range as found in some studies (Lervåg & Hulme, 2010; Litt et al. 2013) but lower than in other studies (Lervåg et al. 2009; Warmington & Hulme, 2012).

Conclusions

The ambition of the present study was to contribute to the understanding of the cognitive foundations of both the early and later phases of spelling development. Ultimately, we hope that such studies can pave the way for improved spelling instruction. The present study raises particular questions about how children can be helped to acquire orthographic spelling knowledge more easily. Factors other than basic PA and LK seem to be of importance, and especially students with poor verbal learning abilities may be in need of explicit instruction. In the present study, general measures of spelling skills were used. To further test the hypothesis that PAL is specifically related to the acquisition of orthographic spelling knowledge, it would be useful to include measures of specific types of orthographic knowledge (e.g., word-specific knowledge; knowledge of graphotactic patterns; knowledge of conditional spelling patterns). Moreover, it would be useful to include multiple tasks to assess PAL, both to ensure reliability and to tease apart the role of different task demands. In particular, it may be of importance to distinguish between the verbal learning and the association part of the task.