Word reading is at the heart of the reading process, and serves as a key element in reading comprehension. For example, the simple view of reading (SVR) (Gough & Tunmer, 1986; Kirby & Savage, 2008) proposes that reading comprehension is the product of word reading and linguistic comprehension. Multiple theories have been put forth to explain how word reading develops. For example, Ehri’s (2005, 2018; Ehri and McCormick, 1998) phase theory describes five phases of word reading development: pre-alphabetic, partial alphabetic, full alphabetic, consolidated alphabetic, and automatic alphabetic; children go through this developmental sequence until word recognition becomes automatized as sight word reading. Morphology appears in the fourth (consolidated alphabetic) phase of this model, which allows for the accumulation of knowledge about the regularities in the writing system (e.g., grapho-phonemic and grapho-syllabic units), which, in turn, leads to growth in the mental lexicon and facilitates word reading ability.

Furthermore, connectionist models (Plaut, McClelland, Seidenberg, & Patterson, 1996; Seidenberg & McClelland, 1989) have distributed pathways that mediate the interaction of orthography, phonology, and semantics of lexical items. The connectionist models explain the dynamic cognitive processes of word reading in terms of weighted connections encoding different units (orthographic, phonologic, or semantic knowledge) in a cooperative and competitive manner (Plaut, 2005). Kirby and Bowers (2017, 2018) have suggested that morphology binds phonology, orthography, and semantics because morphemes contain cues to pronunciation, spelling, and meaning.

Predictors of word reading

Skilled reading in English and other languages relies on a constellation of linguistic and cognitive processes including phonological awareness (PA), naming speed (NS), orthographic processing (OP), and morphological awareness (MA), and each of these cognitive processes focuses on different aspects of the reading process (Arabic: Abu-Rabia, Share, & Mansour, 2003; Tibi & Kirby, 2018a, 2019; Chinese: McBride-Chang et al., 2005; Cross-linguistic: Caravolas, Lervåg, Defior, Málková, & Hulme, 2013; Landerl et al., 2019; English: Carlisle, 2000; Castles, Rastle, & Nation, 2018; National Reading Panel, 2000; Roman, Kirby, Parrila, Wade-Woolley, & Deacon, 2009; Finnish: Georgiou, Papadopoulos, Fella, & Parrila, 2012; Silvén, Poskiparta, Niemi, & Voeten, 2007; German: Landerl & Wimmer, 2000; Greek: Georgiou, Torppa, Manolitsis, Lyytinen, & Parrila, 2012; Hebrew: Ravid & Malenky, 2001; Schiff, Raveh, & Fighel, 2011; Japanese: Muroya et al., 2017; Turkish: Babayiğit & Stainthorp, 2007). Empirical evidence has shown that variability in word reading among individuals (skilled readers vs. dyslexics or poor readers) is related to differences in these cognitive processes. For example, skilled readers perform better at grapheme-phoneme decoding, phonemic awareness, and naming speed (Compton, DeFries, & Olson, 2001; Kirby, Parrila, & Pfeiffer, 2003; Landerl et al., 2013, 2019; Parrila & Protopapas, 2018; Wolf & Bowers, 2000). Most current research on word reading has focused on child-level predictors (e.g., decoding abilities, PA, NS, knowledge of words’ specific meaning, and MA), rather than word-level predictors such as word regularity, morphological transparency, orthographic frequency, and number of letters (Gilbert, Compton, & Kearns, 2011; Kearns, 2015; Kearns & Al Ghanem, 2019; Kearns et al., 2016). Adding word-level predictors offers the potential to account for variance not accounted for by child-level predictors. 
For example, Kearns (2015) and Kearns and Al Ghanem (2019) showed that there were significant separate child and word sources of variance (log-odds > 1.00 in an unconditional model). When Kearns and colleagues examined word and child characteristics in the same models, they found significant effects for both word- and child-levels.

Therefore, it is important to acknowledge that accurate reading depends on both person and word factors (Gilbert et al., 2011; Kearns, 2015; Kearns et al., 2016; Steacy, Elleman, Lovett, & Compton, 2016; Steacy et al., 2018). There is growing evidence that various characteristics of English words contribute to variability in reading. For example, words that contain vowel digraphs (Gilbert et al., 2011) or consonant clusters (Olson, Forsberg, Wise, & Rack, 1994) are more difficult to read than those that do not.

Orthographic regularity is another word feature that has been shown to affect recognition of low frequency words (Balota & Ferraro, 1993; Waters, Seidenberg, & Bruck, 1984). Regularity describes the level to which words adhere to grapheme-phoneme correspondence (GPC) rules; regularity varies across orthographies, from very regular ones such as Finnish to more opaque ones such as English (Seymour, Aro, & Erskine, 2003). Vowelized Arabic orthography, in spite of its high orthographic regularity, may have other features (allography and nonlinear morphology) that could contribute to word reading difficulty (Tibi & Kirby, 2018a).

Semantic characteristics of words such as concreteness and imageability have also been shown to affect word reading accuracy (Coltheart, Laxon, & Keating, 1988; Keenan, Betjemann, & Olson, 2008; for a contrary view, see Duff & Hulme, 2012). For example, word reading is slower for irregular words with low imageability (Strain & Herdman, 1999). There is also evidence from connectionist modeling that the addition of a semantic processor improves both nonword and irregular word reading (Plaut et al., 1996).

Another word-specific feature that has been found to affect reading accuracy is length. Longer words are more challenging to read, whether length is indexed by number of letters (De Luca, Barca, Burani, & Zoccolotti, 2008), by number of syllables (Muncer & Knight, 2012), or by both number of syllables and number of morphemes (Kearns, 2015). Steacy, Elleman, and Compton (2017) argued that such words require readers to employ inductive learning to build orthographic and semantic knowledge, taking into account the statistical probabilities of word components. Kearns et al. (2016) investigated item-specific, child-level, and word-level knowledge among fifth graders with either early- or late-emerging reading difficulties who were asked to read polymorphemic words. They found that at the word level, word frequency and root word family frequency were significant predictors.

Previous research has shown that high frequency words are easier or faster to read than low frequency words (Jared & Seidenberg, 1990; Treiman, Goswami, & Bruck, 1990; Yap & Balota, 2009). Frequency has been assessed with respect to different units. For example, some research reports frequency as a count of phonological units such as the final vowel-consonant unit (the rime unit) in monosyllabic words (Booth & Perfetti, 2002; Treiman et al., 1990), whereas others (Calhoon & Leslie, 2002) reported word frequency and rime-neighborhood size as factors.

A type of frequency related to morphology is “family size” (Baayen, Lieber, & Schreuder, 1997; Schreuder & Baayen, 1997), which refers to the type count of words that share the same root. The more words that are derived or formed from a root, the more productive the root is, and words from larger families are read more accurately (Carlisle & Katz, 2006; Carlisle & Stone, 2005; Deacon, Whalen, & Kirby, 2011; Kearns et al., 2016).

The distributional properties of words, morphemes, and rimes have been shown to affect word and pseudoword recognition (accuracy and latency) in lexical decision tasks across several languages. For example, Dutch productive noun stems have been shown to elicit higher and faster frequency ratings than less productive noun stems with smaller family sizes (Baayen et al., 1997). Family size of the root (i.e., the number of words sharing a base morpheme) has been shown to predict accuracy and latency of word recognition tasks in Hebrew (Moscoso Del Prado Martin et al., 2005), a language with nonlinear morphology. In developmental studies on Hebrew, Schiff et al. (2011) investigated the effect of morphology on priming words among 4th and 7th graders. Their findings revealed that, with young readers, priming occurred when the prime and target words were morphologically and semantically related. However, among the more skilled readers, strong morphological priming occurred regardless of semantic connection. Ravid (2003) has also provided evidence on the prominent role of the root in the lexical acquisition of Hebrew-speaking young children. Arabic, a language similar to Hebrew in its morphological structure that combines roots with word patterns nonlinearly, has also shown an effect of root productivity in lexical decision tasks (Boudelaa & Marslen-Wilson, 2011). Interestingly, Boudelaa and Marslen-Wilson found that priming occurred only when the consonantal roots were productive, regardless of the productivity of the word pattern. Accordingly, they concluded that the distributional property of the Arabic root dominates the lexical access process in Arabic.

The distinctive characteristics of Arabic orthography and morphology call for analysis of the effects of word characteristics, especially in children learning to read. Findings from research on English and other orthographies may not be generalizable to Arabic (see Share, 2008 on the Anglocentricity of research on reading) because Arabic morphology and orthography have features distinct from those of other languages (Tibi & Kirby, 2017, 2018a, 2019; Wofford & Tibi, 2018). Before we describe the current study, we provide a brief description of the Arabic language and its orthography.

Arabic language and orthography

Arabic, like Hebrew, is a Semitic language in its typology (Owens, 2013) and is read right to left. Both are characterized by nonlinear morphology and the primacy of consonantal roots, placing them in stark contrast to Indo-European languages.

Vowelization

Arabic orthography is comprised of 28 letters, including three long vowels (/a:/, /u:/, and /i:/) that are represented orthographically as letters, whereas the three corresponding short vowels (a, i, u) are represented as diacritic marks (e.g., ـَ, ـِ, ـُ, ـْ) and may or may not be added to the words, depending on the type of text; for example, these diacritics are mostly seen in children’s books and religious texts, but are not present in adult novels or newspapers. Although these short vowels are not represented orthographically in all types of texts, their role is quite significant because they provide phonemic and morpho-syntactic information. Abu-Rabia (1996, 1997, 1999, 2007) emphasized that the presence of short vowels enhances reading accuracy for all readers and across all ages and reading skills. Short vowels contribute to the transparency of Arabic orthography allowing for close grapheme-phoneme correspondences. Abu-Rabia (2007) maintained that both beginning and advanced Arabic readers rely on short vowelization for correct pronunciation. He also argued that because of homography and the role of vowelization of words’ final letters in determining grammar, word recognition is highly dependent on vowelization. Abu-Rabia (2007) also reported that roots of words and short vowelization were essential factors for reading accuracy in highly skilled adult Arabic readers.

Ligaturing

Arabic orthography is also characterized by ligaturing (connectivity), which requires certain letters (22 out of 28) to be connected to other letters. The process of ligaturing changes the shape of the letters (the variant shapes are termed allographs), depending on their location in the word (isolated, initial, medial, or final), and the change for some letters is substantial; for example, the letter ‘ه’, roughly equivalent to “h,” has the following different shapes: ه (isolated), هـ (initial), ـهـ (medial), and ـه (final). Ligaturing contributes to orthographic depth (Share & Daniels, 2015; Tibi & Kirby, 2018a, 2019). Several researchers (Khateb, Taha, Elias, & Ibrahim, 2013; Taha, Ibrahim, & Khateb, 2013) have found that connected words (due to compulsory ligaturing) were easier to process than non-connected words for skilled readers and dyslexics. Taha and Khateeb (2018) showed that first graders were more likely to accept ligatured words than unconnected words as real words and that kindergarteners similarly accepted ligatured pseudowords; they argued that this was due to the frequent occurrence of connected words in Arabic. Dai, Ibrahim, and Share (2013) found that grade 3 children read ligatured pseudowords more slowly than non-ligatured pseudowords, but in a second experiment, diacritics had an even greater effect. In the end, they concluded that “ligatures do not create any special burden for Arabic readers” (p. 204).

Morphology

A third characteristic of Arabic lexical structure is its morphology, both linear and nonlinear (Boudelaa & Marslen-Wilson, 2001; Tibi & Kirby, 2017, 2018a, 2019). In Arabic linear morphology (used for inflections), morphemes are added to words as prefixes and/or suffixes (as in English) and provide information about the grammatical functions of a word such as gender, number (singular, dual, and plural), person, and time. For example, /rassa:m/ “painter” + /a:n/ “dual inflectional marker” = /rassa:ma:n/, which means “two painters.” In contrast, in nonlinear morphology (used for derivations), words are formed by interleaving two abstract bound morphemes: the consonantal root (providing meaning) and the mostly vocalic word pattern (providing morphosyntactic functions). For example, from the consonantal root /r.s.m./, we can derive /rasama/ “to draw,” /marsam/ “studio/atelier,” /rassa:m/ “painter/artist,” /marsu:m/ “was drawn/painted,” and many more. In the preceding examples, /a.a.a/, /ma..a/, /a.aa./, and /ma..u./ constitute the word patterns. Because of this compulsory nonlinear process in Arabic word derivation, the vast majority of Arabic words are at least bimorphemic (Tibi & Kirby, 2017). It is important to note that the root and the word pattern are both bound morphemes, and only when they are combined is a meaningful word created. Also, because ligaturing in Arabic orthography allows for affixes and clitics to be connected to derived words, some Arabic words may contain the conjunction “and” as a clitic, together with a subject, verb, and object (e.g., /fajarsumu:hum/ “and they draw/paint them”). Parsing such a one-word sentence into its constituents (/fa/ “and,” /jarsum/ “paint/draw,” /u:/ “suffix indicating ‘they’” as the masculine plural, and /hum/ “them”) is probably a challenging task for beginning Arabic learners, although we are not aware that this has been demonstrated empirically.
We know, though, that root awareness (RA) plays a central role in multiple reading outcomes (word reading accuracy, pseudoword reading accuracy, word reading fluency, text reading fluency, and maze reading comprehension) among Arabic-speaking third graders (Tibi, Tock, & Kirby, 2019). Using structural equation modeling, Tibi et al. (2019) reported that RA accounted for 43.30% of the variance in reading achievement as the latent variable, after accounting for vocabulary. In a number of studies, Abu-Rabia (2007), Abu-Rabia and Abu-Rahmoun (2012), and Abu-Rabia et al. (2003) have found that (a) all readers (adults and normal and dyslexic children) performed better on vowelized measures than on unvowelized measures; (b) readers made use of the information provided by the root when reading pseudowords with real roots; and (c) dyslexic readers tended to rely on roots and short vowelization in reading words, whereas normal readers tended to rely on roots only if words were unvowelized. Abu-Rabia and Abu-Rahmoun explained that typically-achieving readers have richer morphological lexicons, whereas dyslexic readers utilize root identification as a compensatory strategy to make up for their poor phonological decoding skills. Finally, Boudelaa and Marslen-Wilson (2005, 2011) demonstrated that priming among adult students was determined entirely by the root, not the word pattern, irrespective of semantic transparency or root productivity. Boudelaa and Marslen-Wilson (2011) concluded that “the basic processes of morphemic segmentation in Arabic are organized around the root” (p. 641).

These characteristics of Arabic language and orthography (transparency of vowelized Arabic, ligaturing, and complex morphology) invite the question: How might these characteristics of Arabic orthography affect word reading? In spite of the transparency of vowelized Arabic orthography, there are good reasons to regard Arabic as having orthographic depth (Share & Daniels, 2015; Tibi & Kirby, 2018a). First, the graphemes themselves are more complex due to allography and ligaturing. Second, some Arabic orthographic units (e.g., roots and derivational morphemes) may be more difficult to learn because they do not occur contiguously (because of the interweaving of roots and word patterns). These features of Arabic suggest that a number of word-level characteristics may contribute to word reading difficulties, in particular the number of morphemes in the words. They also suggest that individual differences in morphological awareness (Carlisle & Katz, 2006; Carlisle & Stone, 2005; Kearns et al., 2016) will be important.

Although there is a growing research literature on the predictors of Arabic reading (for review, see Tibi & Kirby, 2019), no Arabic research has addressed the multiple word-level features in crossed-random effects models. To paint a more complete picture, we also need to examine the characteristics of words that make reading more difficult for beginning readers.

The current study

The aim of the current study was to investigate the effects of word-level characteristics on Arabic word reading using Cross-Classified Generalized Random-Effects (CCGRE) analysis. CCGRE allows for the simultaneous estimation of multiple word-level predictors in addition to person-level predictors. Although prior research on Arabic has investigated the predictors of Arabic word reading (e.g., Abu Ahmad, Ibrahim, & Share, 2014; Abu-Rabia et al., 2003; Asaad & Eviatar, 2014; Asadi & Khateb, 2017; Asadi, Khateb, Ibrahim, & Taha, 2017; Layes, Lalonde, & Rebaï, 2017; Taibah & Haynes, 2011; Tibi & Kirby, 2017, 2018a, 2019), none of the existing studies has explored multiple word factors in one model using CCGRE. Utilizing CCGRE allows for random effects models in which word characteristics and person factors are investigated simultaneously. The current study investigated the effects of number of letters, number of syllables, number of morphemes, ligaturing, orthographic frequency, root type frequency, concreteness, and part of speech as word-level characteristics and morphological awareness (MA) as a person-level predictor of word reading accuracy. We hypothesized, based on the research in other languages reviewed earlier (e.g., Carlisle & Katz, 2006; Carlisle & Stone, 2005; Kearns, 2015; Kearns et al., 2016), that each of these word characteristics (except for ligaturing) will have a negative effect on Arabic word reading. 
More specifically and based on the predictive validity of MA in reading Arabic (Tibi & Kirby, 2017, 2019) as well as the primacy of the root in children’s Arabic reading (Tibi & Kirby, 2019; Tibi et al., 2019; Abu-Rabia & Abu-Rahmoun, 2012) and adults’ lexical word recognition (Abu-Rabia, 2007; Boudelaa & Marslen-Wilson, 2001, 2004, 2005), the productivity of the root in priming (Boudelaa & Marslen-Wilson, 2011), and the importance of MA in Arabic reading (e.g., Abu-Rabia & Abu-Rahmoun, 2012; Tibi & Kirby, 2017, 2019; Tibi et al., 2019), we hypothesized that the morphological variables (number of morphemes, root type frequency, and, at the person level, MA) will predict reading accuracy. Following Taha and Khateeb (2018) and Taha et al. (2013), we did not expect ligaturing to influence word reading accuracy.

Method

Participants

The participants were 303 grade three children from two studies conducted by Tibi and Kirby (2017, 2018a). The first sample included 102 Arabic-speaking children (51 male; mean age = 104 months, SD = 5.7 months) who came from a broad range of public female and male schools in Abu-Dhabi, representing a range of socioeconomic backgrounds (Tibi & Kirby, 2017), as described by Abu-Dhabi Council of Education officials. The second sample comprised 201 students (101 male; mean age = 97 months, SD = 5.4) who were randomly selected from a group of public schools provided by the Ministry of Education in Dubai (Tibi & Kirby, 2018a). The criteria for inclusion in both samples were informed parental consent and being native Arabic-speaking Emirati children with both parents as native speakers of Arabic. None of the participants showed signs of hearing, visual, or language impairment.

Word reading measure

A set of 80 fully vowelized words (see Appendix) was administered to both groups of participants as part of a battery of tests (Tibi & Kirby, 2017, 2018a), with scores ranging from 0 to 80 words correct (mean = 49.61, SD = 23.23). All words were shown with all vowels. We did this (a) to increase the chances of accurate reading because the participants may not have mastered reading yet and may still be relying on decoding (Tibi & Kirby, 2018a), (b) to eliminate the homographic effect, that is, the possibility of multiple accurate readings of some words (e.g., some verbs could be read either in the passive or active form if not vowelized, such as /jastaxdim/ “to use” or /justaxdam/ “was used”), and (c) because this is the form most familiar to children. Words were presented on a laminated A-4 paper, in a series of rows with four words in each row.

The words were selected based on the frequency count from two sources each representing a different corpus. One source was the lexical database of Modern Standard Arabic known as Aralex (Boudelaa & Marslen-Wilson, 2010). Aralex consists of 40 million word tokens primarily drawn from unvowelized Arabic newspapers available online and includes information about orthographic forms, vowelized stems, unvowelized stems, roots, and word patterns. Aralex also provides statistical information about words and morphemes. The second source was a corpus of 147,527 word tokens compiled from Arabic textbooks used in primary schools (Grades 1–6) from two different Arab countries (Libya and UAE) by Belkhouche, Harmain, Al Najjar, Taha, and Tibi (2010). The following is a brief description of each of the word characteristics (see also Table 1).

Table 1 Description of word level predictors

Orthographic frequency

Words on the reading test ranged in orthographic frequency from most frequent (the preposition /fi:/ (in) with a frequency of 32,189.29 in Aralex and 4100 in the textbook corpus) to the lowest frequency of .03 in Aralex (and 1 in the textbook corpus) for the one-word phrase /ʕazi:matuka/ (your determination/will). We used the Aralex frequencies divided by 1000 to keep the predictors on similar scales. The frequencies indicate the number of times each distinct orthographic form occurs in the 40-million-word corpus. Boudelaa and Marslen-Wilson (2010) defined orthographic form as “the graphic entity that occurs with white space on either side of it” (p. 484). There were two words for which Aralex did not provide orthographic frequencies, although it provided their respective root type frequencies. For these two words, we used ArabiCorpus (Parkinson, n.d.), which is based on adult corpora and provides the frequency for the surface form only. Because ArabiCorpus provides the frequency per 100,000 words, these were multiplied by 10.
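
As a rough illustration of the unit conversions described above, the following sketch assumes that the Aralex values are expressed per million words. The text does not state this unit explicitly; it is inferred here from the fact that the per-100,000 ArabiCorpus values were multiplied by 10. The function names are hypothetical and are not taken from the study's materials.

```python
def per_100k_to_per_million(freq_per_100k: float) -> float:
    """Convert an ArabiCorpus-style frequency (per 100,000 words)
    to a per-million value (assumed to be the Aralex unit)."""
    return freq_per_100k * 10


def rescale_for_model(freq_per_million: float) -> float:
    """Divide by 1000 so the predictor sits on a scale closer to
    the other word-level predictors, as described in the text."""
    return freq_per_million / 1000


# Values reported in the text for the most and least frequent words.
most_frequent = rescale_for_model(32189.29)   # the preposition /fi:/
least_frequent = rescale_for_model(0.03)      # /ʕazi:matuka/

# A hypothetical ArabiCorpus value of 2.5 per 100,000 words.
converted = rescale_for_model(per_100k_to_per_million(2.5))
```

Under these assumptions, the rescaled predictor spans roughly 0.00003 to 32.19 in the sample of 80 words.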

Word length

The numbers of letters and syllables in each word were counted. The words varied in their number of letters, ranging from 2 (“in” ‘في’) to 10 letters (“the hospitals”). The mean number of letters was 5.76 (SD = 1.66). The words also varied in their number of syllables from one to six syllables with a mean of 3.49 (SD = 1.07).

Number of morphemes

Words ranged in number of morphemes from one (e.g., the pronoun “he” or the preposition “in”) to five morphemes (e.g., “the contestants” /ʔalmutasa:biqu:n/) with a mean of 2.86 (SD = 1.14). Each word was given a score based on the number of morphemes (derivational, inflectional, and clitics) in it and the number of morphological transformations the word had undergone. For example, the word “the book” /ʔal-kita:b/ received 3 points: 1 point for its root /k.t.b/, another point for being transformed into a noun, and another point for the particle “the” /ʔal/. Another example of a polymorphemic word that underwent multiple morphological transformations and included an inflectional morpheme is “the contestants” /ʔalmutasa:biqu:n/, which received a score of 5 as follows: 1 point for the root /s.b.q/, another point for the derived verb /tasa:baqa/, a third point for the agentive noun /mutasa:biq/ that is derived from the verb, a fourth point for the inflectional suffix /u:n/ that marks the masculine plural of the noun (/mutasa:biqu:n/), and a fifth point for the addition of the definite article “the” /ʔal/.

Ligaturing

Words varied in the number of ligatured letters, ranging from one (“he” هُوَ) to seven (e.g., “the hospitals”), with a mean of 3.09 (SD = 1.42).

Root type frequency

According to Boudelaa and Marslen-Wilson (2010), root type frequencies are “raw counts and are extracted from the dictionary. Specifically, the type frequency (or the morphological family size) for a particular root is the number of stems containing that particular root” (p. 484). Roots of the words in the current study spanned a range of frequencies from a minimum of one (/ʕankabu:t/) to a maximum of 50 (the root for the word “circle,” i.e., to form a circle or a round shape), with a mean of 16.35 (SD = 10.24). It should be noted that 5 out of 80 words did not have a root type frequency in Aralex. Accordingly, each was assigned a value of 1, as recommended by Boudelaa (personal communication, January 23, 2019).

Concreteness

Words were assessed as either concrete or abstract. Of the 80 words, 41 (51%) were concrete words (e.g., vegetables, hospitals, spider, bed) and 39 (49%) abstract (what, was, private, contemplation, remember). Words deemed concrete are the ones that are mentally more imageable (Paivio, Yuille, & Madigan, 1968).

Part of speech

The words covered a range of different parts of speech: 45 nouns (56%), 23 verbs (29%), 4 adjectives (5%), 4 personal pronouns (5%), and 4 other words including 1 preposition (1.25%), 2 relative pronouns (2.5%), and 1 question word (1.25%). Part of speech was assessed with a set of contrasts between adjectives and the other categories, with adjectives as the reference group; thus, the contrasts were nouns vs. adjectives, verbs vs. adjectives, etc.
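
The contrast scheme described above can be sketched as ordinary dummy coding with adjectives as the reference category. This is an illustrative sketch only: the category labels, the grouping of the remaining words into a single "other" category, and the function name are assumptions made here for demonstration, not the coding script used in the study.

```python
# Five part-of-speech categories; "adjective" is the reference group.
CATEGORIES = ["adjective", "noun", "verb", "pronoun", "other"]
REFERENCE = "adjective"


def dummy_code(part_of_speech: str) -> dict:
    """Return one 0/1 indicator per non-reference category.

    A word in the reference category gets 0 on every indicator, so each
    coefficient compares one category with adjectives.
    """
    if part_of_speech not in CATEGORIES:
        raise ValueError(f"unknown part of speech: {part_of_speech}")
    return {
        f"{category}_vs_{REFERENCE}": int(part_of_speech == category)
        for category in CATEGORIES
        if category != REFERENCE
    }


noun_codes = dummy_code("noun")
adjective_codes = dummy_code("adjective")
```

Five categories yield four indicator variables, which is why part of speech contributes four parameters rather than one.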

Morphological awareness

The person-level characteristic MA was assessed using the sentence completion task (Tibi & Kirby, 2017), which was based on Carlisle’s (2000) measure. Participants were presented with 20 written sentences, each preceded by a clue word, and they were asked to complete each sentence in writing with the correct form of the clue word. The participant’s score was the number of items correct. Scores ranged from 0 to 20 with a mean of 13.37 (SD = 4.21). Some of the items required inflections and others involved derivations.

Procedure

Both the word reading and MA tests were administered by the first author who is a native speaker of Arabic. Participants were instructed to read the words on the word reading test as accurately as possible. The test was administered individually and was not timed. Testing began with three practice items and was discontinued after seven consecutive errors to spare children repeated failure. The termination rule was justified because the reading test was previously piloted (Tibi & Kirby, 2017) and deemed to be ordered from easy to difficult. The MA test was preceded by four practice items to ensure children’s understanding of the task. Instruction was provided in the Emirati dialect, which was the children’s spoken dialect.
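
The discontinue rule can be sketched as follows. This is a minimal illustration under one explicit assumption, namely that items not administered after the stopping point count as incorrect; the text does not state how such items were scored. The function name and the example response pattern are hypothetical.

```python
from typing import Sequence, Tuple


def score_with_discontinue(
    responses: Sequence[bool], stop_after: int = 7
) -> Tuple[int, int]:
    """Score an ordered word list under a discontinue rule.

    Testing stops once `stop_after` consecutive errors occur.
    Returns (number correct, number of items administered).
    Items after the stopping point are never reached, so they
    contribute nothing to the score (assumption: scored as incorrect).
    """
    correct = 0
    administered = 0
    consecutive_errors = 0
    for is_correct in responses:
        administered += 1
        if is_correct:
            correct += 1
            consecutive_errors = 0  # a correct response resets the run
        else:
            consecutive_errors += 1
            if consecutive_errors >= stop_after:
                break
    return correct, administered


# Hypothetical child: 5 correct, then 7 errors in a row, so the
# last 3 items are never administered.
example = score_with_discontinue([True] * 5 + [False] * 7 + [True] * 3)
```

Because a correct response resets the error count, a child who makes six errors, reads one word correctly, and then makes six more errors would continue to be tested.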

Results

Descriptive statistics for all item-level predictors, MA score, and the word reading measure are provided in Table 1. The number of correct words read by the participants ranged from 0 to 79 with a mean of 49.61 (SD = 23.23). The correlation between MA and word reading was 0.63. The correlations among the word-level predictors appear in Table 2. While there were many nonsignificant relations among the predictors, a strong subset of correlations were observed for ligaturing, number of letters, number of syllables, and number of morphemes with the correlations among these measures ranging above .70.

Table 2 Correlations

Cross-classified generalized random-effects models

Item responses for all 80 words were modeled using a series of cross-classified generalized random-effects (CCGRE) models. These models allow for the estimation of variability in item responses between students as well as the variance between words. Because all the responses were binary (correct/incorrect), we used a binomial distribution with a logit link function to predict the probability of getting an item correct based upon the set of item characteristics. In these models, persons are crossed with items, and both persons and items are treated as random factors. For the conditional models, one person-level predictor (morphological awareness) and eight item-level measures (number of letters, number of syllables, number of morphemes, amount of ligaturing, orthographic frequency, root type frequency, part of speech (whether the word was a noun, pronoun, verb, adjective, or other), and concreteness) were included as predictors. To estimate the zero-order relations of these predictors to the probability of getting an item correct, we fit a separate model for each predictor; to investigate which predictors were uniquely significant after controlling for the other predictors in the model, we fit one model with all the predictors entered simultaneously.
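
One common way to write a model of this kind is given below. The notation is an assumption introduced here for illustration (the symbols do not appear in the original text): $Y_{pi}$ is the response of person $p$ to word $i$, $\mathrm{MA}_p$ is the person-level predictor, $X_{ki}$ are the word-level predictors, and $u_p$ and $w_i$ are the crossed random effects for persons and words.

```latex
\[
\operatorname{logit}\bigl[\Pr(Y_{pi} = 1)\bigr]
  = \gamma_0
  + \beta_{\mathrm{MA}}\,\mathrm{MA}_p
  + \sum_{k=1}^{K} \beta_k X_{ki}
  + u_p + w_i,
\qquad
u_p \sim N(0, \sigma^2_{u}), \quad
w_i \sim N(0, \sigma^2_{w})
\]
```

In this notation, the unconditional model retains only $\gamma_0$, $u_p$, and $w_i$, and each single-predictor model adds one predictor (or, for part of speech, one set of dummy-coded contrasts) to the unconditional model.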

The first CCGRE model fit was an unconditional model (see Table 3). This model estimates the variance attributable to students, the variance attributable to words, and a grand mean (intercept) that yields the probability of getting an item correct (in logits). The results of this model revealed substantial variability due to students (SD = 2.34) and due to items (SD = 1.76), indicating that there is enough variability in item responses to warrant predicting it with our predictors.
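
To give a sense of the relative size of these two variance components, the sketch below converts the reported standard deviations to variances and expresses them as shares of the total. It relies on the latent-variable convention of fixing the residual variance of a logistic model at π²/3; this convention is an assumption adopted here for illustration, and the resulting shares are not reported in the text.

```python
import math

# Residual variance of the standard logistic distribution
# (latent-variable convention; an assumption, not from the text).
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def variance_shares(sd_person: float, sd_item: float) -> dict:
    """Share of total latent variance due to persons, items, and residual."""
    var_person = sd_person ** 2
    var_item = sd_item ** 2
    total = var_person + var_item + LOGISTIC_RESIDUAL_VARIANCE
    return {
        "person": var_person / total,
        "item": var_item / total,
        "residual": LOGISTIC_RESIDUAL_VARIANCE / total,
    }


# Standard deviations reported for the unconditional model.
shares = variance_shares(sd_person=2.34, sd_item=1.76)
```

Under this convention, roughly 46% of the latent variance lies between students and roughly 26% between words.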

Table 3 Unconditional model

The next set of CCGRE models estimated the effect of each predictor alone to predict the odds of getting an item correct. For MA, number of letters, syllables, ligatures, and morphemes, we grand-mean centered the predictor so that the intercept could be interpreted as the odds of getting an item correct when the score on the predictor was average in the sample. For the other predictors, we dummy-coded the predictors and left them uncentered because they have a natural zero. The intercept then becomes the estimate for the referent category. The results of these models are in Table 4. Because we were fitting nine models (with a total of 12 parameters because of the 5 parts of speech), we controlled for Type I error using the Benjamini-Hochberg linear step-up procedure (Benjamini & Hochberg, 1995). Ignoring the p values for the intercepts, there were 12 p values for which Type I error control was needed. After performing the linear step-up procedure, the critical p value was determined to be .038, and therefore only MA, number of letters, number of syllables, number of morphemes, ligaturing, orthographic frequency, the difference in probability between pronouns and adjectives, nouns and adjectives, and adjectives and other were statistically significant.
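
The linear step-up procedure can be sketched as follows. The p values in the example are hypothetical and were chosen only so that nine of twelve tests are rejected; the false discovery rate of .05 is likewise an assumption, since the text does not state the level used. The function name is illustrative.

```python
from typing import List, Sequence, Tuple


def benjamini_hochberg(
    p_values: Sequence[float], q: float = 0.05
) -> Tuple[List[bool], float]:
    """Benjamini-Hochberg linear step-up procedure.

    Sort the m p values, find the largest rank k such that
    p_(k) <= (k / m) * q, and reject the k smallest p values.
    Returns (rejection flags in the original order, critical p value).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * q:
            largest_k = rank
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_k:
            rejected[index] = True
    critical_p = largest_k / m * q
    return rejected, critical_p


# Hypothetical p values for 12 tests (not the values in Table 4).
hypothetical_p = [0.001, 0.002, 0.003, 0.004, 0.005, 0.01,
                  0.02, 0.03, 0.035, 0.20, 0.50, 0.80]
flags, critical = benjamini_hochberg(hypothetical_p)
```

With 12 tests and an assumed rate of .05, nine rejections give a critical value of 9/12 × .05 = .0375, which agrees with the reported value of .038 after rounding.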

Table 4 Results from the nine separate models predicting word reading

The final CCGRE model entered the predictors simultaneously into the model, and the parameter estimates are reported in Table 5. After conducting the linear step-up procedure (critical p value = .008), only number of morphemes and MA remained significant. The odds ratio for number of morphemes was .45, meaning that for each additional morpheme, the odds of reading a word correctly dropped by more than half (a 55% reduction).
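Because odds ratios are multiplicative on the odds scale, their effect on the probability of a correct response depends on the starting point. The sketch below applies the reported odds ratio of .45 to a few baseline probabilities; the baselines are hypothetical and chosen only for illustration, and the calculation holds the person and item random effects fixed.

```python
import math


def probability_after_odds_ratio(baseline_prob, odds_ratio, steps=1):
    """Probability of success after multiplying the baseline odds
    by `odds_ratio` once per unit increase in the predictor."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio ** steps
    return new_odds / (1.0 + new_odds)


ODDS_RATIO_MORPHEMES = 0.45  # reported for number of morphemes

# Equivalent coefficient on the logit scale (about -0.80)
logit_coefficient = math.log(ODDS_RATIO_MORPHEMES)
print(f"logit coefficient = {logit_coefficient:.2f}")

# Hypothetical baseline probabilities, for illustration only
for baseline in (0.5, 0.8, 0.9):
    after = probability_after_odds_ratio(baseline, ODDS_RATIO_MORPHEMES)
    print(f"baseline {baseline:.2f} -> one more morpheme {after:.2f}")
```

The same odds ratio therefore implies a larger drop in probability for a word that a child has an even chance of reading than for a word that is already read correctly most of the time.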

Table 5 Results from the simultaneous prediction model of word reading

Discussion

The purpose of the current study was to investigate the effects of word-level characteristics and individual differences in morphological awareness on word reading accuracy. We used a statistical method that is novel in Arabic research, cross-classified generalized random-effects modeling, which allowed the effects of person and word characteristics on word reading to be estimated simultaneously. Of the word characteristics hypothesized to be related to word reading accuracy, all but root type frequency and semantics (concreteness) were significant in the single-predictor models (Table 4). Contrary to what would be expected based on Taha and Khateeb (2018) and Taha et al. (2013), the number of ligatures was negatively related to word reading. This negative effect may indicate that words with more ligatures also tend to have other features (length, infrequency, etc.) that are related to word reading (see Table 1). When these other features are controlled in the combined model (Table 5), the ligaturing effect becomes positive but nonsignificant.

When all features were combined in one model, the number of morphemes and morphological awareness were the only significant predictors (Table 5). This result partially confirms our hypothesis that the morphological variables (MA, number of morphemes, and root type frequency) would be important predictors and supports the expectation that the morphological complexity of Arabic words presents a serious obstacle for children learning to read. These findings corroborate previous evidence from Arabic regarding person-level predictors (Abu-Rabia & Abu-Rahmoun, 2012; Tibi, 2016; Tibi & Kirby, 2017, 2018a, 2019; Tibi et al., 2019) and are consistent with much research in English and other orthographies (e.g., Carlisle & Katz, 2006; Carlisle & Stone, 2005; Gilbert et al., 2011; Kearns, 2015; Kearns et al., 2016; Steacy et al., 2016; Steacy et al., 2018; Verhoeven & Perfetti, 2011). On the other hand, this result raises the question of why root type frequency, a measure of morphological family size, failed to reach significance in either the individual or the full model.

It was surprising that root type frequency did not show a significant effect, given what previous studies have found (Boudelaa & Marslen-Wilson, 2011; Carlisle & Katz, 2006; Deacon et al., 2011; Kearns et al., 2016). Several possible explanations can be suggested. First, the participants in the current study were much younger than the university student participants in the study by Boudelaa and Marslen-Wilson (2011), who would also have had more reading experience and more extensive lexicons. The current participants may not have had enough reading experience to have developed mental representations of the roots that were robust enough to help them in reading. These children may still be working at the grapheme-phoneme decoding phase, not having yet developed “consolidated” representations (Ehri, 2005, 2018). This may be particularly true for Arabic, in which the root consonants are separated by graphemes from the word patterns (e.g., [d.r.s] “to study” “schools”). When the elements of the root are dispersed rather than contiguous (as they are in English and other alphabetic orthographies), it may take many more exposures to a root across multiple words to develop a solid mental representation of it. We would expect that as children gain reading experience, their mental representations of roots will become better established, and that characteristics such as root type and root token frequency will contribute to their successful word reading.

A second explanation is that the Aralex root type frequencies are based on dictionary listings (i.e., the number of words in the dictionary that include each root) and therefore may include many words that children would never have encountered. We had considered using Aralex’s root token frequencies (i.e., how frequently each root appeared in the Aralex corpus) but could not do so because of their collinearity with orthographic frequency. The fact that orthographic frequency was a significant predictor in the single-predictor model suggests that root token frequency might also have been; however, the nonsignificance of orthographic frequency in the full model suggests that root token frequency would likewise have been nonsignificant there. Future research should employ frequencies derived from a corpus based on words children are likely to have seen, for instance those from school textbooks. In addition, including unvowelized words in the word reading measure might change the results in favor of root type frequency. When words are unvowelized, the young reader might rely more on the root extraction process (Boudelaa & Marslen-Wilson, 2005, 2011, 2015) and on identification of larger groups of letters (Ehri, 2005, 2018).

Last, but certainly not least, the absence of the root type frequency effect could be a consequence of instructional practices. It may be the result of a lack of explicit instruction in morphological awareness and morphemic analysis of multimorphemic words and a lack of practice in deriving multiple words from the same root. Alternatively, children may be taught to recognize polymorphemic words as whole words or to sound them out as sequences of phonemes rather than as structures composed of morphemes. Although phonological processing is important in Arabic (Tibi & Kirby, 2018a), it needs to be combined with morphological processing to deal with polymorphemic words (a comparable argument can be made in English, e.g., Kirby & Bowers, 2017, 2018). In the absence of direct observations of instructional practices, however, these instructional interpretations remain speculative.

Implications, limitations, and future directions

The main implication of this research for practice is that children learning to read Arabic need to be taught to recognize and manipulate morphemes in the words they are reading and that particular attention needs to be given to longer, multimorphemic words. We already know that morphological awareness is a predictor of success in reading Arabic (Tibi & Kirby, 2017; Tibi et al., 2019) and that it survives the control of phonological awareness (Tibi & Kirby, 2019). Even if phonological awareness is the stronger predictor while children are learning to read (Tibi & Kirby, 2019), morphological awareness explains additional variance and may provide access to decoding for children whose phonological awareness is weak, such as those with phonological dyslexia (Elbro & Arnbak, 1996). Morphological processing would allow children to work with larger orthographic units, a necessary step towards fluency. As we noted earlier, Arabic morphology is more complex than that of English, so teaching practices and materials that work in English (e.g., Bowers, Kirby, & Deacon, 2010; Goodwin & Ahn, 2010, 2013; Kirby & Bowers, 2017, 2018) will need to be adapted. Phonologically based instruction can be integrated with morphologically based instruction. There is a need for curriculum development and instructional research in Arabic.

The main implication for future research is that more word-level analyses are required in Arabic, and this research should begin to jointly consider further child-level factors, such as phonological awareness, naming speed, specific semantic knowledge of words, and vocabulary. Moreover, further analyses involving orthographic and phonological shifts are needed in Arabic. For example, in the words “asked” /saʔala/ and “question” /suʔa:l/, there are orthographic and phonological shifts by virtue of the different word patterns and the spelling convention (the middle consonant, a glottal stop, changes its shape because of the /u/ vowel preceding it). We know that shift words are more difficult to read in English than transparent words, especially among young children (Carlisle, 2000; Carlisle & Stone, 2005; Kearns, 2015). As noted earlier, research in English has found significant interactions between word- and child-level factors (e.g., Gilbert et al., 2011; Kearns, 2015; Kearns et al., 2016; Steacy et al., 2016; Steacy et al., 2018). For example, Kearns (2015) found that students with greater MA skill were better able to read words with shifts. Therefore, such interactions need to be explored in Arabic in models at the item level. It would also be useful to include a measure of children’s knowledge of the roots of each word on the reading test (Kearns et al., 2016).

There are some limitations in the present study that should be acknowledged. First, the children in the current study came from one educational system (UAE), so generalizing from these results should be done with caution. The UAE has its own spoken dialect that is different from the dialects spoken elsewhere, and this could have played a role in the children’s reading due to some linguistic interference (e.g., vocabulary) from the spoken dialect. Future studies should draw samples from multiple Arabic-speaking countries and educational systems. Second, the orthographic form and root type frequencies were taken from Aralex (Boudelaa & Marslen-Wilson, 2010), which is based on an Arabic dictionary and a corpus primarily related to adults. Frequency counts based on a corpus of children’s books may lead to different results. Future efforts should be geared towards developing a large corpus from children’s books. Furthermore, future research should include word pattern frequencies as these may also affect word reading. A third limitation is that the findings are tied to the words that were selected in the current study. It is possible that a different set of words that did not include clitics or multiple morpheme suffixes would yield somewhat different results.

In conclusion, this study sought to identify the role of word features in word reading accuracy. Findings showed that number of morphemes and morphological awareness were the strongest factors in children’s reading accuracy. These results should encourage future studies using item-level analyses, particularly as a part of intervention studies (e.g., see Steacy et al., 2016). These results also support the explicit inclusion of morphological instruction in early Arabic literacy programs (Tibi & Kirby, 2018b; Wofford & Tibi, 2018). All of this should help children become more aware of the morphological structure in their language and contribute to improved reading development.