Introduction

In recent years, much research has been dedicated to comparing the acquisition of written word processing mechanisms in deep and shallow orthographic systems (Seymour, Aro, & Erskine, 2003; Snowling & Hulme, 2006; Treiman & Kessler, 2005; Ziegler et al., 2010). In shallow systems, orthography reflects surface phonology with a high level of consistency. This is not the case in deep systems, in which orthography simultaneously conveys phonological and morphological information. For example, in French, a deep system, il chante (he sings) and ils chantent (they sing) are phonologically identical: /il ʃɑ̃t/. Orthography, however, includes the plural marks s and nt, which are unsounded. As a consequence, the French-speaking speller has to use morpho-syntactic knowledge in order to write s and nt correctly when necessary (Fayol, Hupet, & Largy, 1999; Pacton & Fayol, 2003; Sénéchal, Basque, & Laclaire, 2006). Interest in comparing systems has increased because previous models of reading and spelling were mainly based on studies carried out in English, which possesses the deepest orthographic system of all European languages (Seymour et al., 2003). For this reason, doubts might be raised about the advisability of generalizing English-based models to other systems (Share, 2008). The most explicit formulation of the theory that word processing does indeed differ in deep and shallow systems is the Orthographic Depth Hypothesis (ODH), which supposes that word identification in shallow orthographic systems is exclusively based on phonological prelexical computation (Frost, 2005; Katz & Frost, 1992, 2001; Ziegler & Goswami, 2005). This strong version of the ODH is sometimes replaced by a weaker version, in which orthographic knowledge also plays a role in word identification; but even under this weaker version of the ODH, the role of phonology remains far more important in shallow than in deep systems.
In the context of the ODH, it might be considered that phonological prelexical computation in deep orthographic systems is insufficient for identifying most words, and, as a consequence, orthographic representations of words are stored in the orthographic lexicon to allow word identification. However, this is not the case in shallow systems.

Studies comparing shallow and deep orthographic systems have clearly established that learning to read is easier in shallow systems. In a comprehensive study comparing English with twelve other European languages, Seymour et al. (2003, p. 167) concluded that “readers of English require two and a half or more years of literacy learning to achieve a mastery of familiar word recognition and simple decoding, which is reached within the first year of learning in a majority of European languages”. These results confirm the findings of a relatively extensive literature, including comparisons of Turkish with English (Öney & Goldman, 1984), German with English (Wimmer & Goswami, 1994), Dutch with English (Patel, Snowling, & de Jong, 2004), Spanish with Portuguese (Defior, Martos, & Cary, 2002), Spanish with French and English (Goswami, Gombert, & Barrera, 1998), English, Hungarian, Dutch, Portuguese and French (Ziegler et al., 2010), and Welsh with English (Spencer & Hanley, 2003). Neurological studies comparing the Italian and English languages have shown that reading in the former produces a greater activation of brain areas associated with phonological processing. In contrast, the areas involved in semantic processing and word naming were more strongly activated when reading English (Paulesu et al., 2001). The contribution of the orthographic lexicon to word reading would merely be masked by the rapidity of sub-lexical phonological word processing in a shallow system. Similar conclusions were reached by Dehaene (2007).

The aim of the present study was to compare the development of spelling mechanisms in French and Spanish. The above-mentioned studies confined themselves to comparing reading acquisition, where the consistency of the grapheme-to-phoneme translation procedure strongly favours shallow orthographic systems. In the present study, spelling development is considered rather than reading, because spelling requires using the specific orthographic features of words, which are not indispensable in word reading. In Spanish, for example, the phoneme /b/ might be spelled b or v. It might be argued that a Spanish learner will identify a word both when it is correctly spelled (víbora) and when it is not (*bíbora). In contrast, in order to spell víbora correctly, its complete orthographic representation is necessary. Similarly, in French, the phoneme /ɑ̃/ is sometimes spelled an and sometimes en, but these two graphemes are almost always pronounced /ɑ̃/. This creates an asymmetry: reading can proceed safely using grapheme-to-phoneme processing, whereas correct word spelling requires lexical knowledge. For this reason, spelling development is a more powerful tool for examining the development of the orthographic lexicon.

Our reason for considering French as an example of a deep orthographic system deserves an explanation. French orthography is relatively consistent when it comes to reading but extremely inconsistent in the case of spelling. A set of grapheme-to-phoneme translation pairs allows one to read most French words, but word spelling using phoneme-to-grapheme pairs (PGPs) is considerably less efficient. A computer simulation study aimed at evaluating the productivity of a set of PGPs in a corpus of 3,724 current French words (function words excluded) has shown that about 50 PGPs allow the correct transcription of 88 % of phonemes but only about 50 % of words (Véronis, 1988). In other words, a procedure based on PGPs permits the correct spelling of no more than half of French words. Inevitably, in order to spell French words correctly, the child must possess and use a number of linguistic abilities, notably lexical, that go far beyond the phoneme-to-grapheme translation processes used in Véronis’s simulation. No such evaluation of the productivity of a set of PGPs for reading and spelling Spanish words is available. However, it is easy to argue that the number of PGPs is lower in Spanish than in French and that their productivity is considerably greater. Vowel spelling is a good example of the difference between the Spanish and French systems. Spanish has only five vocalic phonemes, which are spelled using five graphemes (the only exception is the phoneme /i/, which is spelled i most of the time but can sometimes be spelled y, usually in word endings). French, by contrast, has no fewer than 14 vowels, some of which can be spelled in two, three or even more ways (for example, /ɛ/ can be spelled e, è, ai, ei; /o/ → o, au, eau; /ɛ̃/ → in, ain, ein, and so on). Since each syllable has at least one vowel, the number of rules required for spelling French words is considerably greater than for Spanish words, and their productivity is far lower.
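One reason why phoneme-level accuracy (88 %) and word-level accuracy (about 50 %) can diverge so sharply is compounding: a word is spelled correctly only if every one of its phonemes is transcribed correctly. The sketch below illustrates this under a simplifying assumption that is ours and not part of the Véronis (1988) simulation, namely that each phoneme is transcribed correctly with the same probability, independently of the others. The function name and the word lengths are hypothetical.

```python
def word_accuracy(phoneme_accuracy: float, n_phonemes: int) -> float:
    """Probability that a whole word is transcribed correctly.

    Illustrative assumption (not from the source study): every phoneme is
    transcribed correctly with the same probability, independently.
    """
    if not 0.0 <= phoneme_accuracy <= 1.0:
        raise ValueError("phoneme_accuracy must lie between 0 and 1")
    if n_phonemes < 1:
        raise ValueError("a word needs at least one phoneme")
    return phoneme_accuracy ** n_phonemes


# Hypothetical word lengths, expressed in phonemes.
for length in range(3, 8):
    percent = 100 * word_accuracy(0.88, length)
    print(f"{length} phonemes: {percent:.1f} % of words expected correct")
```

Under this assumption, words of about five phonemes would be spelled correctly roughly half of the time, which is of the same order as the word-level figure reported above. The real simulation need not behave this way, since errors are unlikely to be independent across phonemes.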
We mentioned above that French is relatively consistent at reading level, but the vowel spelling complexities just described may produce reading difficulties (for example, the word ananas (pineapple) can be read correctly /a na na/ but can also be decoded /ɑ̃ ɑ̃ a/). Besides, French morphology is more thoroughly represented in written than in spoken language, since it makes use of numerous orthographic marks which have no phonological counterpart, as is the case with the final s of plurals of nouns and adjectives, and the final nt of verbs. In addition to morphologically motivated word endings, which make the learner’s task difficult (Fayol et al., 1999; Pacton & Fayol, 2003; Sénéchal et al., 2006), some other final consonants might be sounded or not (e.g., p in cap /kap/ and loup /lu/ [cap and wolf], c in hamac /amak/ and tabac /taba/ [hammock and tobacco], l in fusil /fyzi/ and persil /pεRsil/ [gun and parsley], etc.), and in some cases their pronunciation depends on the context (dix is pronounced /dis/ in “tu as dix” [you have ten], /diz/ in “dix arbres” [ten trees], and /di/ in “dix voitures” [ten cars]). It is obvious that these final letters, which are unsounded or pronounced differently depending on the context, complicate the reader’s decoding task. To conclude, the fact that French is more consistent at reading than at spelling level does not mean that it is as consistent as Spanish, even at reading level.

To summarize, the prediction derived from the conceptual frame of the ODH is that the orthographic lexicon should develop more rapidly in French than in Spanish because the phoneme-to-grapheme translation system is sufficient to spell most Spanish words. French spellers, on the contrary, must rely on the orthographic lexicon. A different prediction, however, might be formulated on the basis of the self-teaching hypothesis (STH), which relates the elaboration of the orthographic representation of words to successful word identification using pre-lexical phonological processing. Repeated identification of a word progressively generates its representation, which contains specific orthographic information (Share, 1995, 1999). Thus, the orthographic lexicon develops as a direct consequence of word identification using sub-lexical grapheme-to-phoneme processing mechanisms. It might be hypothesized that the orthographic lexicon would develop more rapidly in a shallow than in a deep orthographic system because grapheme-to-phoneme processing is easier. Contrary to this prediction, Share (2004) observed that first-graders reading in pointed Hebrew, a quite consistent system, did not show any relationship between the repeated reading of new words and storing their orthographic representation. By contrast, several studies have shown that the repeated decoding of new words does generate orthographic representations of them (Bowey & Muller, 2005; Cunningham, Perry, Stanovich, & Share, 2002; de Jong & Share, 2007; Nation, Angell, & Castles, 2007; Reitsma, 1983a, b; Share & Shalev, 2004). These studies were carried out in English and Dutch, which are deeper than pointed Hebrew. Share concluded that deep orthographic systems promote the storage of orthographic representations of words to a greater extent than shallow systems, a notion that has recently been defended by Lété, Peereman and Fayol (2008) in a comprehensive study of the acquisition of French spelling.
It is difficult, however, to reconcile the notion that the easy reading of words does not promote the storage of orthographic information with the basic claim of the STH, which maintains that successful word decoding generates orthographic word representations. In the present study, the development of word frequency effects in spelling is compared in French and Spanish learners. If the ODH is correct, it might be expected that word frequency effects will be stronger and earlier in French than in Spanish. If the results point to the contrary, they can only be interpreted as favouring the classical self-teaching notion, which claims that orthographic representations result from successful word identification.

In order to explore this issue, participants were asked to spell words of high and low frequency containing inconsistent PGPs presented in their Non-Dominant case (e.g., the PGP /s/ → c is inconsistent in cigarette, /siɡaRεt/, because /s/ can take different graphemic values (s, c, t, ss, sc), and Non-Dominant because French spellers usually adopt s to spell /s/). It is obvious that taking inconsistent PGPs in their dominant version would be inadequate, because correct spelling could then be produced using either the orthographic representations of words or the dominant version of the phoneme-to-grapheme rule. Inconsistent PGPs are considerably more frequent in French than in Spanish, although their incidence in Spanish is not negligible. For example, the phoneme /b/, which, as mentioned above, is quite frequent, can be spelled b or v; and /χ/ followed by /e/ or /i/ can be spelled j or g, etc. In order to correctly spell words containing Inconsistent-Non-Dominant (IND) items, participants must use their orthographic lexicon. Besides the IND items, the participants were asked to spell high and low frequency words containing totally consistent PGPs that involve contextual constraints (Consistent Context Dependent pairs: CCD). For example, the phoneme /ɡ/ followed by /e/ or /i/ is, in Spanish as well as in French, consistently spelled gu rather than the more straightforward g. The interest of including this notion in a cross-linguistic perspective arises from recent discussions about the use of surrounding information to translate phonology into orthography (Treiman, Kessler, & Bick, 2002). Accordingly, Lété et al. (2008) reported that French children are influenced by syllabic consistency beyond phoneme-grapheme consistency, suggesting that they exploit surrounding information to spell.
These authors hypothesized that “more consistent orthographies, such as Italian or Spanish, might lead spellers to pay less attention to the context when converting orthographically unambiguous sounds” (p. 972). In the present study, two PGPs, both of which presented exactly the same characteristics in French and Spanish, were used to test this hypothesis.

The first CCD item was the phoneme /ɡ/ followed by /e/ or /i/ which, as mentioned above, is always spelled gu (henceforth /ɡ/e-i → gu), as in the French word guerre and its Spanish equivalent guerra (war). In both languages /ɡ/e-i → gu has as a competitor /ɡ/ → g, typically used when the speller fails to take the context into account. This error consists of spelling *gerre in French (where g is pronounced /ʒ/), and *gerra in Spanish (where g is pronounced /χ/). The resemblance between languages is so strong that the syllables /χe/ and /χi/ in Spanish, and /ʒe/ and /ʒi/ in French, can be spelled with the grapheme j in both cases. The second CCD item was /n/p-b → m (i.e., champion and campeón in French and Spanish, respectively). In French, a nasal vowel such as /ɑ̃/, like the phoneme /n/ in Spanish, is always spelled with m rather than n when followed by p or b; using n is the typical error.
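The two CCD items can be thought of as small context-sensitive rules: the grapheme chosen for a phoneme depends on what follows it. The sketch below is only an illustration of that idea. The function names are hypothetical, plain letters stand in for phonemes, and the rules are deliberately reduced to the two contexts discussed above rather than covering the full French or Spanish systems.

```python
def spell_hard_g(next_vowel: str) -> str:
    """Grapheme for the phoneme /g/ given the vowel that follows it.

    Simplified rule: 'gu' before e or i, plain 'g' otherwise.
    """
    return "gu" if next_vowel in ("e", "i") else "g"


def spell_nasal(next_consonant: str) -> str:
    """Grapheme for the nasal element given the consonant that follows it.

    Simplified rule: 'm' before p or b, 'n' otherwise.
    """
    return "m" if next_consonant in ("p", "b") else "n"


# A speller who ignores the context always falls back on the context-free
# competitor, which yields the typical errors described in the text.
print(spell_hard_g("e"))  # context-sensitive choice, as in guerre / guerra
print(spell_hard_g("a"))  # no constraint applies
print(spell_nasal("p"))   # context-sensitive choice, as in champion / campeón
print(spell_nasal("t"))   # no constraint applies
```

Because both rules are fully determined by the following segment, a speller who applies them needs no word-specific knowledge, which is what distinguishes CCD items from IND items.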

Recent studies of spelling mechanisms have considered the contribution of two sources of consistency, feed forward (phoneme-to-grapheme) and feed backward (grapheme-to-phoneme), separately (Lété et al., 2008). The CCD items offer the possibility of examining the effects of feed backward consistency. The problems encountered by Spanish and French spellers with words like guerra/guerre might be due to the strong association that exists in both languages between the grapheme g and the phoneme /ɡ/. Similarly, the French nasal vowels might activate the grapheme n more strongly than m and produce errors like *chanpion instead of champion. The same tendency probably holds in Spanish, too.

Method

Participants

The French sample consisted of 74 monolingual children attending regular classrooms at the same elementary school in Belgium. The group included 18 children tested at the beginning of 2nd grade (mean age: 7 years and 2 months; range: 6.9–7.7), 22 at the end of 2nd grade (M = 7.9, range: 7.5–8.2), 18 in 3rd grade (M = 8.10, range: 8.5–9.3), and 16 in 5th grade (M = 11.0, range: 10.5–11.4). The Spanish sample was composed of 561 monolingual children attending five classrooms of three different elementary schools in Spain: 141 at the beginning of 2nd grade (mean age: 7.4, range: 6.11–7.9), 130 at the end of 2nd grade (M = 7.10, range: 7.5–8.3), 141 in 3rd grade (M = 8.9, range: 8.5–9.4), and 149 in 5th grade (M = 10.10, range: 10.6–11.4).

The schools attended by the Spanish and the French groups taught reading using a phonic method. That is to say, the participants had been presented with the alphabetic code from the beginning of first grade onwards. The details of the method may have differed from teacher to teacher, but, in all cases, the basic principle was that the relation between letters and their corresponding phonemes was explicitly presented to the participants in a systematic manner from the very beginning of 1st grade.

In French, the overall reading level of each participant was assessed with a forced-choice sentence completion test (Lobrot, 1973), which consisted of 36 sentences with a missing word. Five alternatives were proposed in each trial and the children had to choose the correct one. For example: Si vous mangez ce gâteau, dit ma mère, vous verrez comme il est… long, rond, bon, doux, chou (If you eat this cake, my mother says, you will see how it is … long, round, good, soft, cabbage). Similarly, the Spanish group was evaluated with TECLE (Test Colectivo de Eficacia Lectora, Collective Test of Reading Efficiency), a reading test inspired by the Lobrot test (Marín & Carrillo, 1999), consisting of 64 sentences with a missing word. The reader had to choose the correct response from among four items: two words, one of them being the correct response, and two pseudo words; all of the foils were similar to the correct response at the orthographic and phonological levels. For example: Tengo un soldado tan pequeño como un … (I have a soldier as small as a …) guisante (correct word), guiarle (incorrect word), guifante (pseudo word), guisanle (pseudo word).

In both tests, participants completed as many sentences as they could in 5 min, the resulting score being the percentage of correct responses obtained in this fixed period of time. As the test progressed, the complexity of the task increased; words became less frequent and syntactic, pragmatic, and cognitive aspects of the sentences became more complex. The test provides a global evaluation of reading ability that includes both specific (efficient written word identification) and non-specific (general linguistic and cognitive knowledge) abilities. It is obvious that, in order to obtain high scores in this test, sufficiently high levels of efficiency are needed in all of these basic abilities.

Conditions and materials

The word-spelling task used for the French group consisted of 24 items. Six words, half high-frequency and half low-frequency, were selected for each IND item: /s/ → c and /z/ → z. In the same manner, three high-frequency and three low-frequency words were selected for each CCD item: /ɡ/e-i → gu and /n/p-b → m. In the Spanish group, the PGP /n/b → m was omitted to eliminate the possibility of the child spelling the phoneme /b/ using v instead of b (i.e., *canvio instead of cambio). In this case, the constraint “m before b” is not relevant. Twenty-four words were selected for the Spanish group: six words (half high-frequency and half low-frequency) for each IND item, /b/ → v and /χ/e-i → g, and for each CCD item, /n/p → m and /ɡ/e-i → gu. Series of frequent and infrequent words containing the target PGP were first chosen from Brulex (Content, Mousty, & Radeau, 1990) and from the Spanish Frequency Dictionary (Alameda & Cuetos, 1995) for French and Spanish items, respectively. Then, two additional frequency dictionaries based on school texts were consulted: Manulex (Lété, Sprenger-Charolles, & Colé, 2004) and the Diccionario de frecuencias del castellano escrito en niños de 6 a 12 años (Martínez & García, 2004). The complete lists of Spanish and French words and their frequencies appear in Appendices 1 and 2, respectively.

Procedure

All the children were tested in groups in their classrooms, with the items from the different conditions randomly mixed. The experimenter read the target word aloud in the context of a sentence, then repeated it in isolation before the children wrote it down. The children were given an answer sheet on which the carrier sentence was printed with the target word missing, and they had to fill in the blank to complete the sentence. Only the graphemes corresponding to the target phonemes were considered in the analysis (e.g., if the word cigarette was used to test the /s/ → c item, only the letter(s) corresponding to the first phoneme were taken into account).

Results

The results will be presented in four parts. In the first part, the percentage of correct responses per language and grade obtained in the IND and CCD conditions as a function of word frequency will be examined. These data appear in Table 1. In parts 2 and 3, separate analyses of the data obtained in the CCD and IND conditions, respectively, will be presented. Tables 2 (CCD) and 3 (IND) present the percentage of correct responses per language and grade for each PGP as a function of word frequency. In these separate CCD and IND analyses, individual reading scores were introduced as covariates in the corresponding ANCOVAs to provide specific insight into the spelling mechanisms beyond individual differences in reading ability, which were obviously strongly correlated with spelling. These ANCOVAs had to be run separately for French and Spanish because reading scores do not have the same meaning in the two languages. Indeed, nothing permitted us to consider that reading “n” sentences correctly in the French reading test (or “n %” of them) had the same meaning as reading the same number of sentences (or the same percentage) in the Spanish test. On the contrary, despite the fact that both tests had the same structure and captured the same underlying abilities, they differed in terms of sentence length, sentences being shorter in Spanish (7.64 words, range: 4–16) than in French (11.75, range: 5–25). Finally, in the fourth part, intercorrelations between different aspects of spelling ability will be examined.

Table 1 Mean percent of correct responses and standard deviations per condition: CCD (consistent context dependent) and IND (inconsistent non dominant), frequency: HF (high frequency) and LF (low frequency), per language and grade (b = beginning: e = end)

General analysis

Table 1 presents the percentage of correct responses per language and grade obtained with IND and CCD items as a function of word frequency. Overall, performance was higher in Spanish than in French (72.6 and 48.4 %, respectively), and the difference between languages was particularly marked at the beginning of the 2nd grade, when French children reached only 10 % correct while Spanish children correctly spelled 55 % of the items. It is also apparent from Table 1 that the effects of consistency and word frequency interacted with grade and language. The differences between high and low frequency words and between CCD and IND conditions were smaller among the youngest children in the French than in the Spanish group.

A 2 × 4 × 2 × 2 (Language × Grade × Frequency × Consistency [CCD vs. IND]) ANOVA, with Frequency and Consistency as within-subject factors, was run, taking the percentage of correct spellings as the dependent variable. A linear mixed model (not the classic repeated measures model) was adopted because it is well suited to comparing the effects of Frequency and Consistency at each Grade level in French and Spanish, which was our main aim. All of the main effects were highly significant: F(1, 627) = 106.14, p < .001, η² = .15, for Language, F(3, 627) = 81.55, p < .001, η² = .28, for Grade, F(1, 1891) = 42.36, p < .001, η² = .02, for Consistency and F(1, 1891) = 253.14, p < .001, η² = .12, for Frequency. In addition, all of the two-way interactions involving Language were significant: F(3, 627) = 13.57, p < .001, η² = .05; F(1, 1891) = 19.60, p < .001, η² = .01 and F(1, 1891) = 9.11, p = .003, η² = .01, for Language by Grade, Language by Consistency and Language by Frequency, respectively. The difference between languages decreased with Grade (47.6, 27.7, 9.6, and 12.0 % from the beginning of 2nd grade to 5th grade).

The Language by Frequency interaction deserves special attention. As just mentioned, it was highly significant, with the difference between high and low frequency words being greater in French than in Spanish (21.9 and 14.9 %, respectively). This difference in the size of the effect of Frequency arose, at least partially, from a ceiling phenomenon. The performance of the French group with low frequency words was considerably lower than that of the Spanish group (37.4 and 65.1 % correct, respectively). Obviously, when word frequency increased, there was more room to improve performance in French than in Spanish for purely mechanical reasons. However, if the effect of frequency was evaluated using the relative improvement rather than the absolute difference between high and low frequency words, the Spanish group showed stronger effects of Frequency at all grade levels. The relative values were calculated as the difference between high and low frequency words (High − Low) divided by (100 − Low), which represents the maximum possible increase from the low frequency condition (Alegría & Mousty, 1996). The logic of this method assumes that climbing from 20 to 60 % correct responses, which covers half of the total possible distance, is equivalent to climbing from 60 to 80 % correct responses, despite the fact that in the first case the absolute difference is double that of the second case. The relative score was greater in the Spanish than in the French group (42.0 and 35.0 %, respectively).
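The relative index described above can be written as a short function. This is a minimal sketch of the formula (High − Low) / (100 − Low) applied to the illustrative example given in the text; the function name is hypothetical and the result is expressed as a percentage of the available room for improvement.

```python
def relative_effect(high: float, low: float) -> float:
    """Relative improvement from the low to the high frequency condition.

    Both arguments are percentages of correct responses. The result is the
    gain (high - low) as a percentage of the maximum possible gain (100 - low).
    """
    if not (0.0 <= low < 100.0 and 0.0 <= high <= 100.0):
        raise ValueError("scores must be percentages, with low below 100")
    return 100.0 * (high - low) / (100.0 - low)


# The two cases discussed in the text cover the same share of the
# remaining distance, although the absolute gains differ (40 vs. 20 points).
print(relative_effect(high=60, low=20))  # 50.0
print(relative_effect(high=80, low=60))  # 50.0
```

The index is undefined when the low frequency score is already 100 %, which is why the sketch rejects that case rather than dividing by zero.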

The three-way interaction Language by Grade by Frequency in the ANOVA was also significant (F(3, 1891) = 3.83, p = .010), indicating that frequency effects in French were practically absent at the beginning of 2nd grade and increased rapidly from this point onwards, while in the Spanish group the effects of frequency were already present at the beginning of 2nd grade. The mean difference between high and low frequency words, in this grade, was 11.6 % in Spanish and 5.6 % in French (reaching, in relative terms, 22.8 and 5.8 %, respectively). The post hoc pairwise comparisons (Sidak) between high and low frequency words per Grade and Language showed that all of the pairs differed significantly (p < .001) except the difference observed at the beginning of 2nd grade in the French group (F(1, 1891) = 1.61, p = .205). The corresponding value for the Spanish group at the same school level was F(1, 1891) = 54.75, p < .001. Interestingly, the effect of frequency in the French group became significant at the end of 2nd grade (F(1, 1891) = 36.22, p < .001), suggesting that the development of orthographic representations in French takes off during the 2nd grade period.

Let us return to the Language by Consistency interaction, which was, as reported, highly significant. This resulted from the presence of a systematic superiority of consistent items relative to inconsistent items in Spanish (78.9 and 66.3 %, respectively), while no difference was observed in French (49.7 and 47.1 %, respectively). The three-way interaction Language by Grade by Consistency, however, was not significant (F < 1), indicating that the global situation just described remained unchanged across grades. It should be remembered that the CCD condition was introduced in the experiment in order to examine the acquisition of (contextually dependent) spelling items, which were identical in both languages. The Language by Consistency interaction suggested that spelling routines like those involved in the CCD condition were more thoroughly acquired and used in Spanish than in French. When the comparison between languages only took into consideration low frequency words, reducing the use of orthographic representations to a minimum, the difference between consistent and inconsistent words was greater. In the low frequency condition, performance climbed from 54.8 % with IND items to 75.5 % with consistent items in Spanish. The corresponding values in French were 31.4 and 43.5 % (in relative terms the difference between languages was even more impressive, 45.6 and 17.6 %, respectively).

Analysis of the CCD condition

The percentage of correct spellings per language and grade for each PGP (/n/p → m and /ɡ/e-i → gu) as a function of word frequency is presented in Table 2. As in the general analysis, differences between languages were apparent, especially at the beginning of 2nd grade. The score of French children at this level was about 5 % correct, while Spanish children had already reached scores not far from 60 % correct. As mentioned above, the French and Spanish groups were analyzed separately in order to allow reading level to be included as a covariate in the ANCOVAs. Hence, two 4 × 2 × 2 ANCOVAs (Grade × Frequency × PGP [/n/p → m vs. /ɡ/e-i → gu]), with repeated measures in Frequency and PGP, were conducted.

Table 2 Mean percent of correct responses and standard deviations (SD) in condition CCD (consistent context dependent) for each PGP (phoneme-grapheme-pair), as a function of frequency: HF (high frequency) and LF (low frequency), per language and grade (b = beginning: e = end)

The Spanish group

All of the main effects were highly significant (p < .001). The main effect of the PGP (F(1, 556) = 27.94, p < .001, η² = .05) came from the difference in favour of /n/p → m compared to /ɡ/e-i → gu (85.1 and 72.9 %, respectively). This difference might be due to the difference in the incidence of these PGPs: /n/p → m is 2.6 times more frequent than /ɡ/e-i → gu in written Spanish. This question will be considered in the discussion.

The main effect of Frequency, which was highly significant (F(1, 556) = 45.01, p < .001, η² = .08), might seem surprising. It must be remembered that CCD items were totally consistent, so that the presence of frequency effects indicated that, despite the availability of consistent spelling rules explicitly taught in the classroom, the orthographic lexicon contributed to spelling. Grade was highly significant (F(3, 556) = 13.38, p < .001, η² = .07), despite the elimination of individual differences in reading ability, as were its interactions with Frequency and PGP (F(3, 556) = 3.91, p = .009, η² = .02 and F(3, 556) = 3.00, p = .030, η² = .02, respectively).

The French group

The main effects of Grade and PGP were highly significant in this group, too (F(3, 69) = 10.61, p < .001, η² = .32 and F(1, 69) = 7.87, p = .007, η² = .10, respectively). Frequency, however, reached only a marginal level of significance (F(1, 69) = 3.82, p = .055, η² = .05) despite a clear difference between high and low frequency words (55.8 and 43.5 %, respectively). Furthermore, the Frequency by Grade interaction was not significant either (F(3, 69) = 1.13, p = .345). As was the case in Spanish, the effect of the PGP came from the difference in favour of /n/p-b → m compared to /ɡ/e-i → gu (57.3 and 42.0 %, respectively), which might be attributed to the fact that /n/p-b → m is 4.45 times more frequent than /ɡ/e-i → gu in written French. The PGP by Grade interaction was also significant (F(3, 69) = 3.36, p = .024, η² = .13).

Analysis of the IND condition

The percentage of correct spellings per language and grade for each PGP (/b/ → v and /χ/e-i → g in Spanish, and /s/ → c and /z/ → z in French) as a function of word frequency is presented in Table 3. As in the CCD condition, the languages differed, especially at the beginning of 2nd grade. The mean score of the French children at this level was about 10 % correct spellings, while Spanish children reached scores not far from 50 % correct. The two groups were analyzed separately in order to permit the inclusion of reading level as a covariate in the ANCOVAs. Hence, two 4 × 2 × 2 ANCOVAs (Grade × Frequency × PGP: /b/ → v and /χ/e-i → g in the Spanish ANCOVA, and /s/ → c and /z/ → z in the French one), with repeated measures in Frequency and PGP, were conducted.

Table 3 Mean percent of correct responses and standard deviations (SD) in condition IND (inconsistent non dominant) for each PGP (phoneme-grapheme-pair), as a function of frequency: HF (high frequency) and LF (low frequency), per language and grade (b = beginning: e = end)

The Spanish group

The main effects of Grade and Frequency were highly significant (F(3, 556) = 6.67, p < .001, η² = .03 and F(1, 556) = 63.56, p < .001, η² = .10, respectively) but their interaction was not (F(3, 556) = 2.40, p = .067). This suggests that the effect of Frequency remained constant during the primary school years, but this is probably the result of a ceiling effect. The absolute effect of Frequency at 5th grade was about 25 %, which is similar to the corresponding values obtained at the other three school levels, while the relative Frequency effect was about 85 %, far greater than the 20 % observed at the beginning of 2nd grade. Neither the main effect of PGP (F(1, 556) = 1.52, p = .219) nor its interactions with Grade and with Frequency (F(3, 556) = 2.33, p = .073 and F(1, 556) = 1.60, p = .207, respectively) were significant. This indicates that both PGPs considered, /b/ → v and /χ/e-i → g, behaved similarly in the face of frequency and schooling.

The French group

The main effects of Grade and Frequency were also highly significant in the French group (F(3, 69) = 6.19, p = .001, η² = .21 and F(1, 69) = 15.04, p < .001, η² = .18 respectively) but their interaction, contrary to what happened in Spanish, was also highly significant (F(3, 69) = 7.01, p < .001, η² = .23). This agrees with the point established in the general analysis, indicating that in French the effect of Frequency was absent at the beginning of 2nd grade and increased rapidly with schooling. As was the case in the Spanish group, neither the main effect of PGP (F(1, 69) = 3.31, p = .073) nor its interactions with Grade and with Frequency (F(3, 69) = 1.26, p = .297, and F < 1, respectively) were significant. This suggests that the PGPs considered in the French group, /s/ → c and /z/ → z, behaved similarly in the face of frequency and schooling.

Intercorrelations

Finally, individual differences in the use of CCD routines to spell and in the storage of lexical representations of words were examined. The STH supposes that orthographic representations of words are elaborated via word reading, using grapheme-to-phoneme decoding mechanisms. It might be hypothesized that these two mechanisms are strongly related to each other. The use of orthographic representations of words to spell was evaluated in its purest possible manifestation in the experiment, that is, by taking the difference between words of high and low frequency in the IND condition, in which phoneme-to-grapheme translation routines were useless. The contribution of these translation routines was assessed by the difference between the consistent and inconsistent conditions for words of low frequency, that is, where the role of orthographic representations was minimal. The correlation between the two measures reached an overall value of r = .504 (p < .001). Obviously, spelling and reading improve during the primary school years, so correlations between different components of these abilities computed across grades largely reflect this general improvement and are of little interest. Therefore, separate correlations by school level were calculated in order to reduce the role of general improvement. The grade-by-grade r-values were .424 at the beginning of 2nd grade, .419 at the end of 2nd, .579 in the 3rd, and finally .612 in the 5th grade (all p < .001). Causal relationships cannot be drawn from purely correlational data, but the results are consistent with the basic claim of the STH, which supposes that storing orthographic representations of words depends on phoneme-to-grapheme spelling processes.
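The two difference scores and their grade-by-grade correlation can be sketched as follows. This is a minimal illustration of the arithmetic and not the analysis script used in the study: the field names (ind_hf, ind_lf, cons_lf, incons_lf), the helper functions, and the children's scores are hypothetical, and the mapping of cons_lf and incons_lf onto the experimental conditions is an assumption.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (variances must be non-zero)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lexical_index(child):
    # Frequency effect in the IND condition: high- minus low-frequency words.
    return child["ind_hf"] - child["ind_lf"]


def sublexical_index(child):
    # Consistency effect restricted to low-frequency words.
    return child["cons_lf"] - child["incons_lf"]


def correlations_by_grade(children):
    """Correlate the two indices separately within each grade."""
    by_grade = defaultdict(list)
    for child in children:
        by_grade[child["grade"]].append(child)
    return {
        grade: pearson_r(
            [lexical_index(c) for c in group],
            [sublexical_index(c) for c in group],
        )
        for grade, group in by_grade.items()
    }


# Invented percent-correct scores for three children in one grade (illustration only).
sample = [
    {"grade": "2b", "ind_hf": 40, "ind_lf": 30, "cons_lf": 60, "incons_lf": 40},
    {"grade": "2b", "ind_hf": 50, "ind_lf": 30, "cons_lf": 70, "incons_lf": 40},
    {"grade": "2b", "ind_hf": 70, "ind_lf": 40, "cons_lf": 95, "incons_lf": 25},
]
print(correlations_by_grade(sample))
```

Computing the coefficient within each grade, rather than over the pooled sample, is what removes the between-grade improvement from the estimate.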

Discussion

The main aim of the present study was to compare the acquisition of two basic spelling mechanisms in the Spanish and French orthographic systems. The discussion is devoted to examining potential mechanisms underlying spelling acquisition in both languages, which represent contrasting systems as regards their level of transparency, and so adds to recent efforts to understand basic reading mechanisms from a comparative viewpoint (Frith, Wimmer, & Landerl, 1998; Landerl & Reitsma, 2005; Sprenger-Charolles, 2003; Wimmer, 1993; Wimmer & Goswami, 1994).

Current theories of spelling acquisition consider that two essential, though not necessarily the only, aspects of this development are the acquisition of sub-lexical processes which permit the translation of phonemes into graphemes, and of an orthographic lexicon in which words are specified with all of their orthographic characteristics. The Spanish and French orthographic systems differ considerably in terms of the productivity of their sub-lexical translating processes, and so the need to store lexical information differs too. To put the matter in its simplest form, we wondered what processes intervene when a Spanish learner decides how to spell viento (wind) (v or b?), compared with a French learner of the same age choosing between s and c to spell cigarette. Although both learners are facing exactly the same local problem, their background systems are rather different. Both need the lexical representation of the word (viento or cigarette) if it is to be correctly spelled. One of the questions was which factors determine the development of the orthographic lexicon. A reasonable candidate was the degree of usefulness of this hypothetical structure. If usefulness was the main determinant, the French lexicon might be expected to “run” faster than the Spanish equivalent because French spellers must rely on it more often. However, the present study suggests that the opposite is the case.

The first question concerns the acquisition of two consistent PGPs, which are identical in Spanish and French. The phoneme /g/ followed by /e/ or /i/ was chosen because, as explained in the Introduction, it is spelled gu without exception in both languages and shares the same competitor: g. The second case studied was /n/p-b → m, which is also identical in both languages. The results showed that, although the local properties of these PGPs are identical, Spanish-speaking children acquired them far more rapidly than their French-speaking counterparts, as shown by the difference between groups at the beginning of 2nd grade. The first conclusion is that it is the difference in the overall complexity of the two orthographic systems which explains the differential rate of acquisition of sophisticated (context-dependent) spelling processes. From any other point of view, it would be difficult to understand why French and Spanish differ in the use of PGPs, which are formally identical and possess the same productivity and identical competitors (see Seymour, 2005; Seymour et al., 2003, for a discussion).Footnote 4 In fact, the number of PGPs to be integrated into the spelling system, as well as their productivity, favors Spanish learners. It has been argued before (Alegría & Mousty, 1996) that French children begin to spell using a simplified set of explicitly taught rules, which excludes contextually dependent cases. This explains why at the beginning of 2nd grade their score in the CCD condition was less than 10 % correct. At the same age, Spanish children spelled almost 60 % correctly, indicating that they possessed and were using complex PGPs considerably earlier. The suggestion proposed by Lété et al. (2008) that French spellers would be more sensitive to contextual constraints than Spanish spellers is clearly not supported by the present results. 
Their proposition was based on the notion that French spellers are more sensitive to syllabic than to phonemic units because syllables are more consistent than phonemes as regards word spelling. This notion is easy to handle in the theoretical context of the grain size theory, which considers that the size of units used to read, and presumably also to spell, is greater in deep than in shallow systems (Ziegler & Goswami, 2005). Our results concerning the CCD items indicate that the complexity of the system as a whole determines the size of the units which are finally adopted to read and spell syllables and words. At the beginning of the search for the adequate units, however, learners of shallow systems like Spanish are more rapid than learners of deeper systems like French and English. This might explain why orthographic representations of words are stored more rapidly in shallow than in deeper systems (discussed below), even though these representations are less useful.

An intriguing feature shared by both languages was the presence of word frequency effects on the CCD items, /n/p-b → m and /g/e-i → gu, which unambiguously reveals the involvement of the orthographic lexicon, despite the availability of totally consistent PGPs. Moreover, in both languages, performance on /n/p-b → m was better than on /g/e-i → gu. These results suggest the presence of complex interactions of lexical and sub-lexical processes in spelling which are worth discussing. The simplest conceivable model of word spelling supposes that lexical and sub-lexical sources of information are consulted, with the sub-lexical processes intervening solely when lexical information is not available. However, the mere presence of frequency effects with CCD items is incompatible with this notion. It is important to add that the CCD items used in the present experiment are explicitly taught in the classroom as spelling rules like: “before p and b always spell m”. If these rules were systematically used, word frequency effects would not emerge. However, they do, even in the last years of primary school, when the children could not be unaware of the rules. Anderson (1983, 1993) has proposed that explicitly taught rules are stored in memory in a declarative format which is not necessarily accessible to the speller. Practice progressively transforms declarative into procedural knowledge, whose accessibility strongly depends on the difficulty of the task. This account can explain the difference between Spanish and French in condition CCD. Even if all of the participants knew the CCD rules, their application is more difficult in French than in Spanish because the former system is more complex than the latter.
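The "simplest conceivable model" can be made concrete with a toy sketch. Everything in it is hypothetical and drastically simplified: the phoneme symbols, the one-entry default mapping table, the function names, and the Spanish example words are illustrative assumptions, not materials or a model from the study. It only shows why, under that model, a totally consistent context-dependent rule yields the same spelling whether or not the word is stored in the lexicon, so that no frequency effect would be expected.

```python
# Hypothetical context-free defaults; any phoneme not listed is spelled by its own symbol.
DEFAULT_PGP = {"k": "c"}


def spell_sublexical(phonemes):
    """Translate phonemes to graphemes, applying two context-dependent rules."""
    graphemes = []
    for i, phoneme in enumerate(phonemes):
        following = phonemes[i + 1] if i + 1 < len(phonemes) else None
        if phoneme == "n" and following in ("p", "b"):
            graphemes.append("m")   # rule: before p and b always spell m
        elif phoneme == "g" and following in ("e", "i"):
            graphemes.append("gu")  # rule: /g/ before e or i is spelled gu
        else:
            graphemes.append(DEFAULT_PGP.get(phoneme, phoneme))
    return "".join(graphemes)


def spell_simplest_model(phonemes, lexicon):
    """Lexical route first; sub-lexical route only if the word is not stored."""
    stored = lexicon.get(tuple(phonemes))
    if stored is not None:
        return stored
    return spell_sublexical(phonemes)


guiso = ["g", "i", "s", "o"]
known = {tuple(guiso): "guiso"}  # a word with a stored orthographic representation

# Both routes give the same output for a consistent rule.
print(spell_simplest_model(guiso, known), spell_simplest_model(guiso, {}))
print(spell_sublexical(["k", "a", "n", "p", "o"]))
```

Because the two routes cannot disagree on such items, the frequency effects observed on CCD items require something beyond this strictly sequential arrangement.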

Another aspect of the present results is worth examining. It is difficult to understand why the results obtained with /g/e-i → gu are poorer than those of /n/p-b → m, for frequent as well as for infrequent words, and in both languages. This finding suggests that there is something special about /g/e-i → gu, which might be the strong relationship existing between the phoneme /g/ and the grapheme g, an association that may at times hinder the PGP spelling process. Indeed, a primitive (non-contextual) association like /g/ → g might override more sophisticated (contextually dependent) associations, as well as lexical information. In this model, the intervention of lexical, followed by sub-lexical, procedures, as supposed by the simplest model, plays no part. Spelling implies the inhibition of primitive associations which compete with more sophisticated ones. Such a mechanism corresponds to the notion of feed-backward introduced by Lété et al. (2008). In agreement with this idea, mention should be made of the results reported by Alegría & Mousty (1996) in French using another CCD item: /k/e-i → qu. Interestingly, this item is formally identical to /g/e-i → gu (a back stop consonant, /g/ or /k/, followed by e or i, spelled with an additional u: gu or qu). In this case, the level of correct spelling reached almost 100 %, and word frequency effects were totally overridden. The authors speculated that the difference between qu and gu was that the grapheme q represents exclusively the phoneme /k/, so that it was not subject to a feed-backward interference phenomenon similar to that observed with /g/. It must be added that the contextually dependent rule /k/e-i → qu has exactly the same properties in Spanish as in French. Unpublished data collected by the authors of this paper show that the performance of Spanish children was almost perfect from the beginning of the 2nd grade (the youngest group examined) onwards.

The second aim of this study was to compare the development of the orthographic lexicon in Spanish and French. To this end, inconsistent non-dominant (IND) items were used because if a child spelled viento with v rather than b, the supposedly default choice, he/she could be credited with possessing an orthographic representation of this word.Footnote 5

The first point to highlight is that the orthographic lexicon develops more rapidly in Spanish than in French learners, as shown by the presence of word frequency effects as early as the beginning of 2nd grade in Spanish children, while these effects were not apparent before the end of 2nd grade for the French children. This result suggests that the hypothesis that storing specific information about words depends on its usefulness is false. This hypothesis predicted a more rapid development of the orthographic lexicon in French than in Spanish, and this is not supported by the present data. It must be noted, however, that the absence of word frequency effects at the beginning of 2nd grade in the present study does not mean that these effects cannot be observed at this school level. In a dictation task using a carefully chosen set of words taken from the classroom books of the children, Martinet, Valdois, & Fayol (2004) found that inconsistent frequent words were better spelled than consistent words in French-speaking 1st graders. They also found analogical effects of well-known words in a pseudo-word spelling task. These results indicate that repeated exposure to words progressively elaborates orthographic representations even in a deep orthographic system. The present results, like those reported by Martinet et al. (2004), fit the STH proposed in several papers by Share (1995, 1999, 2004). The basic feature of the STH is that each successful decoding of a word increases the probability that its orthographic representation will be stored. Storing lexical information is a passive phenomenon and not an active decision of the reader. The difference observed between languages might be attributed to the sub-lexical process, which is easier, because it is more consistent, in Spanish than in French. In this way, Spanish learners begin to identify words on their own more rapidly than French learners. Otherwise, the basic mechanism is identical in both languages. 
Share’s model was initially developed in Hebrew, whose script is even more regular than Spanish (when vowels are represented, which is the case in primary school texts). However, the STH predictions have been tested against data collected in other systems, and have also been directly tested in English (Cunningham, 2006; Cunningham et al., 2002; Nation et al., 2007). The correlation observed in the present study between the contribution of sub-lexical processing and the orthographic lexicon can be understood in the conceptual frame of this model. Similar conclusions were drawn in a longitudinal study conducted in French, which showed that phonological processing was the bootstrapping mechanism for the acquisition of word-specific orthographic representations (Sprenger-Charolles, Siegel, Béchennec, & Serniclaes, 2004). It must be added in the present context that Nation et al. (2007) made a thorough examination of several predictions of Share’s STH in 8- and 9-year-old English-speaking children. The results confirmed the existence of a general relation between phonological decoding and orthographic learning. If, however, the prediction was examined by considering the correctly and incorrectly decoded items separately, the relation did not hold. Some items correctly decoded in the exposure phase of the experiment were not correctly recognized in the orthographic choice test, and some items were recognized correctly but had been read incorrectly in the exposure phase. Nation et al. (2007) concluded that their results do not support a strong version of the STH. Although the present experiment was not conceived to deal with this question, the results show that, as expected in the context of the STH, the orthographic representation of words develops more rapidly in Spanish than in French. Obviously the present results cannot be considered as supporting the strong version of the STH, and finer inter-linguistic comparisons will be necessary to draw conclusions in this area.

If it is admitted that word decoding is the basic mechanism for storing orthographic representations of words, even though it is not the only mechanism, it still remains to be explained why it is more efficient in shallow than in deep systems. The grain size theory proposed by J. C. Ziegler gives a clear account of the differences in reading acquisition in different languages (Ziegler & Goswami, 2005). The most interesting characteristic of this theory is that it does not consider the shallow-deep contrast between orthographic systems solely in terms of the consistency of translation rules, but also incorporates the specific features of the phonological system of the spoken language. It is argued that reading acquisition involves the mapping of orthographic units onto the corresponding phonological units. A shallow system, like Spanish, is easier to learn than a deeper system, like French, because it is phonologically finer grained. Spanish has a clear syllabic structure and only five vocalic phonemes, which facilitates phonological awareness at the phonemic level, and consequently the mapping between these units and the corresponding graphemes. French has 14 vowels, which are represented with the same set of letters as used in Spanish, which inevitably makes mapping more difficult. Besides, French morphology is often unsounded at the phonological level but well represented in orthography (e.g., the plural of nouns, adjectives and verbs mentioned above), which makes the learner’s task difficult, as shown by Fayol et al. (1999), Pacton and Fayol (2003) and Sénéchal et al. (2006). The differences in grain size between Spanish and French might explain the difference in the rate of reading and spelling acquisition.