How do word recognition skills develop and become automatized? Children with good reading abilities seem to form efficient word recognition skills as a result of formal education and reading practice, but what can be done to improve word recognition skills of children with reading deficits? Relatively little is known about the course of development of orthographic representations in children’s reading (e.g., Landi, Perfetti, Bolger, Dunlap, & Foorman, 2006). The present study is an attempt to find some answers to the above question in a language with a regular orthography. We are especially interested in developing training programs that are aimed at improving word recognition speed of poor readers and that are easy to implement in educational settings.

The research literature shows growing evidence that reading problems, at least in more transparent orthographies (like Dutch, German, or Finnish), are manifested as slow reading speed rather than inaccuracy (Holopainen, Ahonen, & Lyytinen, 2001; Landerl, Wimmer, & Frith, 1997; Wimmer, 1993; Wimmer, Mayringer, & Landerl, 1998; Yap & van der Leij, 1993; Ziegler, Perry, Ma-Wyatt, Ladner, & Schulte-Körne, 2003; Zoccolotti et al., 1999). In the research literature on English, the emphasis has typically been on word recognition accuracy, although in recent years the relevance of reading speed has also been noted (e.g., Kame’enui & Simmons, 2001; Ziegler et al., 2003). Concepts closely related to reading speed are reading automatization and reading fluency. The concept of reading fluency is more broadly defined, involving both decoding accuracy and rate (Torgesen, Rashotte, & Alexander, 2001) and the appropriate use of prosodic features and text phrasing (Kuhn & Stahl, 2003; Wolf & Katzir-Cohen, 2001). Theoretical models of reading skills acquisition and automatization have suggested that, when word processes become automated, more cognitive resources are available for higher level processes, like comprehension (LaBerge & Samuels, 1974; Perfetti, 1985). There is empirical evidence showing that increasing reading fluency also improves comprehension (Kuhn & Stahl, 2003; O’Connor, White, & Swanson, 2007). Some researchers have suggested that fluency is a prerequisite for the primary purpose of reading, understanding the meaning of a text (Kuhn & Stahl, 2003); however, empirical evidence concerning the causal relation is scarce (O’Connor et al., 2007; Wolf & Katzir-Cohen, 2001).

One important aspect in investigating reading speed is that slow reading seems to be a persistent handicap. For example, in longitudinal studies, high stability in the development of reading fluency skills has been found (de Jong & van der Leij, 2002; Klicpera & Schabmann, 1993; Landerl & Wimmer, 2008). Furthermore, intervention studies (Thaler, Ebner, Wimmer, & Landerl, 2004; Torgesen et al., 2001) have indicated that low reading fluency shows high stability, whereas accuracy scores can be significantly enhanced (Torgesen et al., 2001). Therefore, further investigation on training reading fluency is warranted.

It has been suggested that, after initial reading acquisition (after breaking the alphabetic code to be able to recode novel words) and through practice, readers seem to be able to form lexical entries containing specific information about the orthographic structure of words and use these larger orthographic units in word identification (Reitsma, 1983; S. E. Shaywitz & B. A. Shaywitz, 2005). The growth of orthographic knowledge is partly word-specific; in other words, this growth has an effect on the increase in the number and quality of individual word representations (Perfetti, 1992; Reitsma, 1983) and partly occurs through the generalization of letter–phoneme connections across large portions of the lexicon (Landi et al., 2006; Perfetti, 1992). Martens and de Jong (2006) showed that the acquisition of orthographic knowledge in the reading of children with typical reading skills stems from the ability to rely on multiletter features. By contrast, children with dyslexia rely on an extremely slow and serial grapheme–phoneme decoding process and do not seem to be able to process larger orthographic units efficiently (Di Filippo, de Luca, Judica, Spinelli, & Zoccolotti, 2006; Spinelli et al., 2005; Ziegler et al., 2003).

A critical prerequisite for growth in orthographic knowledge is the use of phonological recoding as a self-teaching device (Share, 1995). Through translating print into phonology, children acquire orthographic information on the words they encounter. However, as the main problem that dyslexics have is a slow serial grapheme–phoneme decoding process, they need additional practice to make the transition from laborious decoding to efficient and fast visual word recognition. This transition should occur through extensive practice, in other words, providing successive exposure to print (e.g., Kuhn & Stahl, 2003). The most commonly used method for enhancing reading fluency has been repeated reading that consists of the repetition of reading material (words or a passage). Typically, the number of rereadings has been predetermined, or material has been reread until a certain rate criterion has been attained. In addition to repeated reading practice, there have been few attempts to simply increase the amount of reading activities for poor readers by assisted reading practice (e.g., Baker, Gersten, & Keating, 2000; Kuhn & Stahl, 2003; O’Connor et al., 2007; Shany & Biemiller, 1995). Assisted reading practice refers to a training strategy in which the emphasis is on providing extensive exposure to the print with a model of fluent reading (Kuhn & Stahl, 2003).

Two lines in the previous research findings are relevant in designing the present study. The training studies on the repeated reading of single words have revealed that repetitions of words or pseudowords can enhance the reading speed of poor readers (Berends & Reitsma, 2006; Judica, De Luca, Spinelli, & Zoccolotti, 2002; Lemoine, Levy, & Hutchinson, 1993; Levy, Bourassa, & Horn, 1999; Martin-Chang & Levy, 2005; Thaler et al., 2004; Wentink, van Bon, & Schreuder, 1997). However, when this kind of training has been targeted at the poorest readers, they have not attained the level of average readers (e.g., Thaler et al., 2004), or the effects have been item-specific; that is, no generalization effects have been shown (e.g., Berends & Reitsma, 2006; Lemoine et al., 1993; Lovett, Warren-Chaplin, Ransby, & Borden, 1990; Thaler et al., 2004). If the repeated reading of words does not lead to generalization effects, this type of training as a remedial one-to-one tutoring program is a time-consuming task. On the other hand, assisted reading practice has been associated with gains in reading fluency and in reading comprehension scores (Baker et al., 2000; O’Connor et al., 2007; Shany & Biemiller, 1995). However, only a few studies have used experimental designs, and the positive outcomes have emerged as a result of extensive practice (e.g. 32 to 98 h). Assisted reading practice using volunteer tutors could offer a way to implement training in schools quite easily; however, little is known of its efficacy, especially in a language with a regular orthography.

In the present study, we are interested in evaluating the outcomes of two training programs: computer training aimed at increasing the efficiency of access to multiletter, sublexical units and paired reading aimed at encouraging poor readers to spend time in reading activities. Computer training is based on the findings of repeated word reading practice and on the notions that efficient associations between orthographic and phonological representations are critical in word recognition (Van der Leij & van Daal, 1999; Vellutino, Fletcher, Snowling, & Scanlon, 2004). During computer training, a large number of presentations with practice in associating the relevant orthographic unit with the corresponding phonological unit are offered. In the task, a child hears a phonological unit through headphones and clicks the corresponding orthographic stimulus, selecting from a number of options presented on the screen. This type of training enables independent practice, as the continuous presence of an adult tutor is not required. Only a few studies have used this type of training with a focus directly on the intra-modal associations between phonological and orthographic correspondences. In a previous study, Hintikka, Aro, and Lyytinen (2005) used the same type of training to practice the grapheme–phoneme associations of Finnish-speaking first graders with poor prereading skills. The intervention induced significant gains in letter knowledge, compared to the performance of a control group, but there were no differential outcomes in reading acquisition. However, it should be mentioned that the intervention period was short (mean 170 min in total). 
In another study, Magnan, Ecalle, Veuillet, and Collet (2004) trained a group of poor readers for 5 weeks (altogether 10 h) with a computer program in which the requirement was to listen to a syllable and choose the corresponding written form from two options; they found significant gains in word recognition after the training period.

The trained multiletter, sublexical items in computer training are word-initial consonant clusters that are followed by a vowel to ensure pronounceability, like kla or stre. Harm, McCandliss, and Seidenberg (2003) suggested that interventions targeted at the whole-word level or at the sublexical level could be equally effective. Given that the acquisition of orthographic knowledge could derive from the ability to rely on multiletter features (Martens & de Jong, 2006) and the item-specific effects of word-naming intervention studies (e.g., Thaler et al., 2004), training of sublexical units could offer a way of achieving a bigger payoff in terms of transfer effects. If children learn to recognize these multiletter units more automatically, it is a step forward from the serial decoding strategy. In this paper, word-initial consonant clusters are chosen as training items, because they occur frequently in German and have been shown to be difficult for dyslexic children (Bruck & Treiman, 1990; Snowling, 1981; Treiman, 1991). However, training in sublexical units is of little relevance if children are unable to apply this acquired knowledge in word recognition. Faster and more accurate word recognition is a prerequisite for growth in reading fluency (e.g., Perfetti, 1992).

In the present study, a group of children participating in a text-reading intervention is included as a comparison group. This group of participants is selected from a “paired reading” project that was conducted at the same time as computer training at the University of Salzburg (Landerl & Moser, 2006). The paired reading method is, as suggested by Kuhn and Stahl (2003), a nonrepeated assisted-reading approach, in which an adult works one-on-one with a poor reader. Stanovich (1986) argued that poor readers choose not to read because reading is unrewarding to them and, as a result, they increasingly fall behind in developing fluent reading skills. In the present study, the paired reading intervention is aimed at keeping the child reading. Thus, the emphasis is on the easy implementation that is created by minimal demands on teachers, using adult volunteers as tutors and keeping the intervention loosely structured so that it is possible to conduct tutoring with brief volunteer training. In the research literature, there is evidence suggesting that paired reading or one-to-one tutoring can lead to gains in word recognition and comprehension (Erlbaum, Vaughan, Hughes, & Moody, 2000; Kuhn & Stahl, 2003; Morgan & Lyon, 1979; O’Connor et al., 2007), also using volunteers as tutors (Baker et al., 2000). We are interested in evaluating whether improvements emerge when reading speed is trained with a relatively short but intensive paired reading program.

The choice of a control group in intervention studies is always a sensitive and hazardous enterprise. To control for instructional methods and educational context, investigators would need to select control children with similarly poor reading skills from the same classrooms as the trained group and then provide no training for them. However, such a method of selection involves ethical and practical dilemmas, as it is difficult to decide which children should receive training and which should not. Additionally, the use of a control group without any training as a comparison to the intervention condition is problematic because of the “Hawthorne” effect: Are the possible changes merely the result of devoting more attention to the intervention group, or are they the direct result of the specific training? Ideally, to control for attention and retest effects, an intervention study should include two control groups: one without training and one with another type of practice. However, conducting such an intervention study for poor readers requires a lot of resources, and therefore, relatively few reading fluency intervention studies have been done with experimental designs using proper control groups. We decided to contrast the outcomes of two different kinds of training programs, which could act as comparison groups to each other. This way we would be able to control for attention and retest effects; however, the design shows its limitations if the two methods induce equal outcomes. In that case, it is difficult to determine which element of training accounts for the gains or to exclude the possibility of a retest effect.

The hypotheses and research questions in the present study are the following: (1) Specific computerized training by providing a large number of repetitions at the subword level leads to gains in the reading of words with the trained consonant clusters. We expect that specific computerized training will be associated with better gains in the reading of words with the clusters than nonspecific paired reading practice. (2) The growth of reading skills relies strongly on the increase in the number and quality of individual word representations (Perfetti, 1992; Reitsma, 1983); hence, a computerized training program focused on a limited number of sublexical items is hardly likely to have an effect on reading fluency skills in the case of untrained words. Previous studies on tutoring programs (Baker et al., 2000; Kuhn & Stahl, 2003; Shany & Biemiller, 1995) have shown that paired reading practice could lead to improvements in global reading fluency. Thus, we expect that in the global reading fluency task, the paired reading group will show larger improvements than the computer group. If an improvement is seen in the reading of words, it indicates the growth of word representations, whereas an enhancement in pseudoword reading will indicate that the efficiency of grapheme–phoneme conversion has increased. (3) We want to examine whether the age of the participants or their cognitive–linguistic skills (nonverbal reasoning skills, short-term memory, phoneme awareness, and rapid naming) are associated with reading improvement during the training period, as in the research literature, these skills have predicted response to training (e.g., Lovett & Steinbach, 1997; Torgesen & Davis, 1996; Torgesen et al., 1999). 
More specifically, rapid naming skill is a strong correlate of reading fluency (e.g., Holopainen et al., 2001; Wolf & Bowers, 1999), and, at least in the early stages of reading development, poor readers who are slow namers show slower growth in learning new words than do poor readers who are fast namers (Levy et al., 1999).

Method

Participants

The participants were selected from among 412 second and fourth graders from five elementary schools in Salzburg, Austria. The first selection criterion was a reading performance more than one standard deviation below the age-group norm on a classroom reading fluency task developed by Mayringer and Wimmer (2003). In this test, the score was the number of sentences the children could read silently within a time limit of 3 min. The child’s task was to decide whether each sentence was true or not. The sentences were semantically and syntactically simple (e.g., Erdbeeren sind ganz blau, “Strawberries are very blue”) so that comprehension requirements were minimal. In addition, in individually administered oral reading tests, the children had to exhibit reading speed below the 25th percentile on two out of three subtests (high-frequency words, a short text, and pseudowords) of the “Salzburger Lese- und Rechtschreibtest” (Landerl, Wimmer, & Moser, 2006; see Footnote 1). A further inclusion criterion was that the teacher reported that the child had reading problems in a regular classroom situation.
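The three inclusion criteria can be summarized as a simple screening rule. The following Python sketch is illustrative only: the function name, the argument names, and the representation of scores as a z-score and as percentiles are assumptions made for the example, not part of the original study materials.

```python
def meets_inclusion_criteria(fluency_z, subtest_percentiles, teacher_reports_problem):
    """Return True if a child satisfies all three screening criteria.

    fluency_z: classroom fluency score as a z-score against the age-group norm
               (hypothetical representation).
    subtest_percentiles: dict mapping the three oral reading-speed subtests
                         (words, text, pseudowords) to percentile ranks.
    teacher_reports_problem: True if the teacher reported reading problems.
    """
    # Criterion 1: more than one standard deviation below the norm.
    below_norm = fluency_z < -1.0
    # Criterion 2: below the 25th percentile on at least two of three subtests.
    slow_subtests = sum(p < 25 for p in subtest_percentiles.values())
    # Criterion 3: teacher report of reading problems in the classroom.
    return below_norm and slow_subtests >= 2 and teacher_reports_problem


child = {"words": 10, "text": 20, "pseudowords": 40}
print(meets_inclusion_criteria(-1.3, child, True))
```

A child who is slow on only one subtest, or whose classroom score lies within one standard deviation of the norm, would be excluded under this rule.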

After the assessment, 40 poor readers were selected for the study. The group of computer-trained children consisted of 19 children, all of whom attended the same elementary school. The data from one pupil were excluded because she was absent from school for more than 7 days after training and before the posttest; thus, the final sample was 18. Seven children in this group attended the second grade (three boys and four girls) and 11 children the fourth grade (nine boys and two girls). The mean age in the computer group was 9;8 years (SD = 1;5 years). The mean age of the children in Grade 2 was 8;1 years (SD = 0;7 years), and in Grade 4, it was 10;8 years (SD = 0;8 years).

The paired reading group had 21 participants from four schools. Eleven children attended the second grade (five boys and six girls) and ten children the fourth grade (six boys and four girls). The mean age in the paired reading group was 10;5 years (SD = 1;9 years). The mean age of the children in Grade 2 was 9;3 years (SD = 1;1 years), and in Grade 4, it was 11;8 years (SD = 1;2 years).

All children were instructed within regular classroom settings. Participants with German as a second language had received all their formal education in German (however, one participant had not received all his formal education in German, but he had been more than 2 years in German education, and the teacher evaluated his German proficiency as good). In the context of the present study, no measures of German proficiency were administered. The detailed background reading and cognitive characteristics of the groups are presented in Table 1. Statistical analyses of these measures revealed that there were no significant differences between the computer and the paired reading group in any of the measures (all ps > .07).

Table 1 Background reading and cognitive characteristics (means and standard deviations) of children by group

Design and materials

The study consisted of a pretest, a training period, and a posttest. The pretest was carried out at the beginning of April. The training period lasted 6 weeks, starting in the middle of April. It was planned that the participants would attend 25 sessions during these 6 weeks. However, the number of sessions per participant varied between 23 and 26, as a couple of children were absent from school on several occasions. In the computer group, the average number of sessions was 24.9 (SD = 0.5), and in the paired reading group, it was 25.5 (SD = 0.7). The duration of each training session was 15 min. The posttests were conducted after the training period, at the end of May and the beginning of June.

Training program and procedures

The computer group

The computerized training program was developed at the University of Jyväskylä (see Hintikka et al., 2005 or Lyytinen, Ronimus, Alanko, Poikkeus, & Taanila, 2007, for a description of the program). The original goal of the program was to enhance the accuracy of processing for phonemic sounds and, more importantly, to learn to connect these fluently to their orthographic equivalents. A single auditory stimulus was delivered (via high-quality headphones) concurrently with a number of orthographic items (target and distractors) that appeared at the top of the screen embedded within ‘balls’. The balls immediately began to drop downward on the computer screen, and the player’s task was to home in on the relevant orthographic item and to ‘catch’ it by clicking the mouse. If the player did not catch the correct spelling before the ball hit the ground or clicked on an incorrect spelling, the target item was repeated in the next trial, and the correct response was color-highlighted. Each level was played in both uppercase and lowercase letter formats.

With the emphasis on adaptation, the number of orthographic alternatives (distractors) and the speed at which the balls fell were initially set very low. However, as the game proceeded, the number and rate factors were automatically adjusted in keeping with the developing level of the individual player. The program also recorded data on the progress of each child, thus allowing for continuity of subsequent levels of difficulty.
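The exact adaptation rule of the program is not specified here. As a rough illustration of how such an adjustment could work, the Python sketch below speeds up the falling items after a correct response and slows them down after an error, while staying within the presentation-time ranges reported for the German version of the program (8 to 15 s, 5 to 9 s, and 4 to 8 s at Levels 1 to 3). The step size, the function name, and the update rule itself are hypothetical and are not taken from the actual software.

```python
# Presentation-time bounds in seconds per level (fastest, slowest),
# as reported for the German version of the program.
LEVEL_BOUNDS = {1: (8.0, 15.0), 2: (5.0, 9.0), 3: (4.0, 8.0)}


def next_presentation_time(level, current, correct, step=0.5):
    """Hypothetical staircase rule, not the program's documented algorithm.

    A correct response shortens the presentation time by `step` seconds,
    an error lengthens it, and the result is clamped to the level's range.
    """
    fastest, slowest = LEVEL_BOUNDS[level]
    proposed = current - step if correct else current + step
    return min(slowest, max(fastest, proposed))


# Example: a Level 1 player starting at the slowest setting answers correctly.
print(next_presentation_time(1, 15.0, correct=True))
```

Under this assumed rule, a player who keeps responding correctly converges on the fastest presentation time of the current level, and a player who makes errors is given more time again.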

The program used in the present study was translated into German and revised for the purposes of the study. The stimulus material consisted of 44 multiletter, sublexical items. These items were formed from high-frequency word-initial consonant clusters (kr-, fl-, schl-, and str-) with a vowel added (e.g., kra, fle, schlü, and stro; see Appendix). The same onset clusters were used as training material in a previous study (Thaler et al., 2004). There were altogether three levels of training with varying speed requirements. Each level was divided into three sublevels, and the instruction was to practice through each sublevel at least twice before proceeding to the next one. The participants passed a level when they recognized the correct target items three times in a row. The first sublevel consisted of onset clusters with one of five vowels (syllables with -a, -e, -i, -o, and -u), the second sublevel consisted of the items practiced at the first sublevel plus consonant clusters with diphthongs (-au, -ei, and -eu), and the third sublevel included all the previous items together with clusters with German umlauts (-ä, -ö, and -ü). After passing all the sublevels twice, the children were able to move on to the next level, which was identical except that the items descended at an accelerated speed. Presentation times of the items (the time between the auditory stimulus and the target hitting the bottom of the screen) varied between the levels and according to the performance of the player. More specifically, the presentation time varied between 8 and 15 s at the first level, between 5 and 9 s at the second level, and between 4 and 8 s at the third level. During the training period, the number of presentations of the items varied between the children because of the adaptation procedures. The total number of presentations of the 44 items per child during the training period varied between 3,523 and 4,849 (M = 4,230.6, SD = 328.1). We also calculated the mean number of presentations per target item over the participants and over the three levels: A single target item was practiced a minimum of 39.8 times and a maximum of 139.1 times. The mean number of presentations per item was 96.2 (SD = 7.5).
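The structure of the stimulus set and the per-item presentation count can be checked with a few lines of Python. The sketch assumes that every onset cluster was combined with every vowel, diphthong, and umlaut listed above, which is consistent with the reported total of 44 items but cannot be confirmed without the Appendix. The variable and function names are illustrative.

```python
# Onset clusters and the vowel nuclei added at each successive sublevel.
ONSETS = ["kr", "fl", "schl", "str"]
SUBLEVEL_NUCLEI = [
    ["a", "e", "i", "o", "u"],   # sublevel 1: simple vowels
    ["au", "ei", "eu"],          # sublevel 2 adds diphthongs
    ["ä", "ö", "ü"],             # sublevel 3 adds umlauts
]


def items_for_sublevel(sublevel):
    """Return the cumulative item set practiced at a sublevel (1, 2, or 3)."""
    nuclei = [n for group in SUBLEVEL_NUCLEI[:sublevel] for n in group]
    return [onset + nucleus for onset in ONSETS for nucleus in nuclei]


all_items = items_for_sublevel(3)

# Reported mean total number of presentations per child, divided by item count.
mean_total_presentations = 4230.6
mean_per_item = mean_total_presentations / len(all_items)

print(len(items_for_sublevel(1)), len(items_for_sublevel(2)), len(all_items))
print(mean_per_item)
```

Under this assumption the three sublevels contain 20, 32, and 44 items, and dividing the mean total by 44 gives a value of roughly 96 presentations per item, in line with the reported mean of 96.2.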

Training was carried out by the first author and an advanced student writing her master’s thesis on the topic of the present study. The children practiced with the program during normal school hours. In the school, there was a separate classroom with several laptops, and as the children used headphones while playing, it was possible to have four children at a time for one session. After the first tutoring session, the players were able to work independently with the program; however, it was ensured that the children followed the instructions and could practice undisturbed.

The paired reading group

A subgroup of participants from a paired reading project that was carried out at the same time as computer training at the University of Salzburg (Landerl & Moser, 2006) was chosen for the present study. The paired reading project was administered by school psychologists and dyslexia therapists at four elementary schools. In each school, there were four to six tutors from the school community (e.g., parents, grandparents) working on a voluntary basis, each of whom visited the school once or twice per week. The children never worked with their own parents. Before the project started, the tutors were instructed by a school psychologist or dyslexia therapist on the purposes of the study, the typical problems faced by struggling readers, the methods of paired reading to be used, and suitable books for the children. During the project, supervision was available when necessary, and a meeting was arranged where the progress of the project and possible difficulties could be discussed. The contents of paired reading practice were loosely structured, as the main goal was to develop a program that is easy to implement in educational settings. Sessions lasted for 15 min per day. The tutor and child chose the books together; these were always age-appropriate children’s stories. The volunteer tutors were instructed to try to keep the children reading and attending to the stories being read. Five reading strategies were introduced to the volunteers: (1) the child reads aloud to the tutor, (2) the tutor reads aloud to the child, (3) the tutor and child read together at the same time, (4) the child rereads the sentences after the tutor has read them aloud, and (5) the tutor and child read silently, after which a short discussion on the content of the text can take place. The tutors were instructed to employ at least two different kinds of strategies during one session and were told that reading aloud for 15 min is often too laborious for a poor reader.

Reading measures

The reading tasks were all preceded by the presentation of practice items. In the individually administered tests, after the practice items, the examiner presented the lists to the child, simultaneously started a stopwatch, and stopped it when the last item had been attempted. If a child was blocked by a particular item, he or she was encouraged to move on and complete the list. However, it is important to note that the participants in the present study were not seriously hampered in reading accuracy; that is, they gave responses to each item, and they were not often blocked by items.

Reading words with word-initial consonant clusters

This test consisted of two word lists arranged vertically on a sheet in two columns. The word lists were used earlier in a study by Thaler et al. (2004). All words were mono- or bisyllabic and in the vocabulary of an average 8-year-old child (see Thaler et al., 2004). The list of words with trained consonant clusters contained 31 words with the four high-frequency consonantal onsets kr-, fl-, schl-, and str-, as in the words Kran, fliegen, Schlag, and Stroh, respectively (see Appendix). The number of words read correctly and the list-completion time were scored. A parallel version in the posttest included the same words but in a different sequence.

One-minute word and pseudoword reading

Two time-limited reading tasks, in which 144 words or pseudowords of increasing length were presented in upper- and lowercase letters in a list arranged vertically on a sheet in eight columns, were administered to the participants. The task was to read the items as accurately and rapidly as possible within a 1-min time limit. The test has been developed at the University of Salzburg and is currently being standardized (Landerl & Willburger, 2008). The score was the number of correctly read items. Parallel list versions (in which different items were used but they were controlled for frequency and length) were used in the pretest and posttest.

Standardized reading test

These subtests were taken from a standardized reading test, the “Salzburger Lese- und Rechtschreibtest” (Landerl et al., 1997). The test requires a child to read aloud lists of words, pseudowords, and a short text. The total reading time for each subtest was used in the analyses. The error score was the number of items read incorrectly. Parallel list versions (in which different items were used but they were controlled for frequency and length) were used in the pretest and posttest.

Cognitive–linguistic measures

Tests of nonverbal abilities, phonological short-term memory, phoneme awareness, and rapid serial naming were conducted to ensure that there were no significant group differences in terms of cognitive level between the training groups and to analyze the predictive relationships between pretreatment learner characteristics and the gains in reading skills during the training period.

Nonverbal abilities

Nonverbal IQ was assessed with the Raven’s Coloured Progressive Matrices—German version (Bulheller & Häcker, 2002). The test manual reports split-half and test-retest reliabilities. The coefficients for split-half reliabilities vary across different samples between 0.65 and 0.93. Typically, for older children (between 7 and 11 years), the reliabilities are higher. The test–retest reliabilities vary between 0.65 and 0.90. The validity data are reported as relations between different types of intelligence tests. For example, the relationship between the Raven and Hamburger–Wechsler Intelligenztest fuer Kinder—III (HAWIK-III; Tewes, Rossmann, & Schallberger, 2000) varies in different samples of German-speaking children between 0.48 and 0.73 (the correlation between the performance IQ scale and Raven is 0.61).

Short-term memory

The test items were chosen from the German translation of Wechsler Intelligence Scale for Children—III assessment, HAWIK-III (Tewes et al., 2000). In the digit span subtest, the child was required to repeat (forward and backward) a series of spoken digits of increasing length. The score was the number of series repeated correctly, and a standardized score was also given. The test manual reports split-half reliabilities using Spearman–Brown coefficients. The reliability coefficient across different age groups is 0.88. Validity data are not reported separately for subtests in the test manual.

Phoneme awareness

Two phonological subtests were chosen from a standardized German test, Basiskompetenzen für Lese-Rechtschreibleistungen (Stock, Marx, & Schneider, 2003). In the pseudoword segmentation task (eight items), the items contained four to eight phonemes. The pseudoword was pronounced to the child, and then he or she had to say each phoneme of the word separately. In the initial phoneme deletion task (seven items), the child was instructed to delete the initial phoneme of a pseudoword and to say aloud the remaining part. These pseudowords contained four to six phonemes. For the analyses, the two tasks were combined into a score of ‘phoneme awareness’ (max 15). The test manual reports reliabilities separately for the second and fourth grades. The Spearman–Brown coefficients for split-half reliability of the pseudoword segmentation task are 0.57 (second grade) and 0.44 (fourth grade). For the initial phoneme deletion task, the coefficients are 0.72 (second grade) and 0.64 (fourth grade). Validity is reported as criterion based, that is, as a correlation between the overall performance in the phonological test and reading skills. This correlation varies between 0.42 and 0.48, which the authors report as relatively high, as the phoneme awareness test is not a direct measure of literacy.

Rapid naming task

The stimuli consisted of digits. An alternative version of the rapid naming task was used (see Compton, Olson, DeFries, & Pennington, 2002). The stimuli were arranged in ten rows in random order (altogether 50 presentations of five different items) with no successive presentations of the same stimulus item. The children were asked to name the items as rapidly as possible. Before the test proper, the child was given practice items to ensure that he or she was familiar with the item names. The participant’s score was the number of digits named in 20 s. The Pearson correlation coefficient for test–retest reliability in the sample was 0.89.
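A stimulus sheet with these properties can be generated programmatically. The Python sketch below is one possible way to do it and is not the procedure used to create the original test sheet: the specific digits are hypothetical, and the code additionally assumes that each digit appears equally often (10 times), which the description does not state.

```python
import random


def make_naming_sheet(digits, rows=10, per_row=5, seed=0):
    """Build a rows x per_row sheet with no digit repeated in succession.

    Assumes (hypothetically) that each digit occurs equally often.
    """
    rng = random.Random(seed)
    total = rows * per_row
    per_digit = total // len(digits)
    while True:
        remaining = {d: per_digit for d in digits}
        sequence = []
        for _ in range(total):
            # Candidates: digits still available and different from the last one.
            options = [d for d, n in remaining.items()
                       if n > 0 and (not sequence or d != sequence[-1])]
            if not options:
                break  # dead end; restart with a fresh attempt
            # Prefer digits with more copies left to reduce dead ends.
            weights = [remaining[d] for d in options]
            choice = rng.choices(options, weights=weights)[0]
            remaining[choice] -= 1
            sequence.append(choice)
        if len(sequence) == total:
            return [sequence[i * per_row:(i + 1) * per_row] for i in range(rows)]


# Hypothetical digit set; the digits used in the study are not specified.
sheet = make_naming_sheet([2, 4, 6, 7, 9])
for row in sheet:
    print(" ".join(str(d) for d in row))
```

The no-repeat constraint is applied across row boundaries as well, since the child names the items as one continuous sequence.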

Testing procedures

Testing was carried out individually in one session and in a quiet room in school. An advanced student writing her master’s thesis on the topic of the present study carried out the testing. Participants’ responses in the individual reading tests and in the rapid naming test were recorded for subsequent scoring.

Data analyses

A logarithmic transformation was computed for the variable that measured reading speed of words with the consonant clusters, as it was not normally distributed. To analyze gains in global word reading fluency, the individually administered word reading tasks (1-min reading task and lists of frequent words, compound words, and the short text) were combined into a single variable based on the number of syllables read correctly in 1 min. The number of syllables in 1 min was chosen as a measure, as it enabled the comparison of the results with previous findings among typical German-speaking readers (Wimmer & Mayringer, 2002). The correlations between the individual reading tasks varied between 0.55 and 0.86. The Cronbach’s alpha for the combined word reading fluency variable was at the pretest 0.86 among second graders and 0.87 among fourth graders. At the posttest, the Cronbach’s alpha for the combined word reading fluency variable was 0.81 among second graders and 0.87 among fourth graders. The pseudoword reading tasks were similarly combined. The correlations between individual reading tasks varied between 0.64 and 0.82. The Cronbach’s alpha for internal consistency at the pretest was 0.86. At the posttest, the Cronbach’s alpha was 0.90.
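
For readers unfamiliar with the internal consistency coefficient reported above, the following minimal sketch shows how Cronbach's alpha can be computed from task-level scores. The scores are hypothetical and the function name is our own; whether the original analysis standardized the task scores before combining them is not stated, so the sketch uses raw scores.

```python
from statistics import variance

def cronbach_alpha(task_scores):
    """task_scores: one list per task, each holding one score per child (same child order)."""
    k = len(task_scores)
    if k < 2:
        raise ValueError("need at least two tasks")
    # Total score per child across all tasks.
    totals = [sum(child) for child in zip(*task_scores)]
    item_variance_sum = sum(variance(scores) for scores in task_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical syllables-per-minute scores for five children on three tasks.
example_scores = [
    [40, 55, 62, 48, 70],
    [38, 50, 65, 45, 72],
    [42, 58, 60, 50, 69],
]
alpha = cronbach_alpha(example_scores)
```

Alpha approaches 1 when the tasks rank the children in nearly the same way, which is the justification for combining them into a single fluency variable.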

There were no statistically significant interaction effects involving grade (second and fourth). The fourth graders were faster than the second graders in most tasks. However, we were interested in the development of reading skills during the training period. As no differential responses occurred as a function of grade (that is, the amount of improvement did not differ), we decided not to include grade in the final analyses.

Results

The results are presented in three main sections that are divided into subsections. In the first main section, the gains in reading words with the clusters, which were the specific focus of the computer training, are reported. In this section, the first subsection includes the comparison of gains for the computer group and the paired reading group. The second subsection introduces a post hoc analysis of the gains in the consonant cluster segments and only for the computer group. The second main section includes the results in terms of gains in global reading fluency (word and pseudoword lists). It was hypothesized that paired reading practice could lead to improvements in global reading fluency. Additionally, in the third section, we present the associations between the pretreatment learner characteristics and reading gains.

Gains in reading words with consonant clusters

Comparison of pre- with posttest performances

Scores of accuracy in reading words with the consonant clusters were subjected to analysis of variance with group (computer and paired reading) as the between-subjects factor and test session (pretest and posttest) as the within-subjects factor. The results showed that the groups improved their accuracy in reading (see statistics in Table 2 and descriptive statistics in Fig. 1) and that the groups performed at a similar level. The computer and paired reading groups exhibited similar development, as the test session × group interaction was not statistically significant.

Fig. 1

Mean percentage accuracy and times (seconds per item) in the pre- and posttests by group for the words with the consonant clusters. The vertical lines depict standard errors of the means

Table 2 Means (and SD) by group in reading words with consonant clusters at two assessment points

The reading times for the word list with the consonant clusters were subjected to analysis of variance with group (computer and paired reading) as the between-subjects factor and test session (pretest and posttest) as the within-subjects factor. The test session effect was significant (see statistics in Table 2 and descriptive statistics in Fig. 1). The main effect of the group was not statistically significant. The computer and paired reading groups did not differ from each other in the development of reading speed, as the test session × group interaction was not significant.

To summarize, for the whole group of trained participants, a positive improvement in the reading accuracy and speed of words with the consonant clusters emerged; however, the groups did not show differential improvement.
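
In a design with two groups and two test sessions, the test session × group interaction can equivalently be examined by comparing the groups on their gain scores (posttest minus pretest): the squared pooled-variance t statistic equals the interaction F with 1 and N − 2 degrees of freedom. The sketch below illustrates this equivalence with hypothetical gain scores; it is not the analysis script used in the study, and the function name is our own.

```python
from statistics import mean, variance

def interaction_f_from_gains(gains_a, gains_b):
    """Return (F, (df_effect, df_error)) for the session x group interaction.

    gains_a, gains_b: posttest-minus-pretest scores of the two groups.
    """
    n_a, n_b = len(gains_a), len(gains_b)
    df_error = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(gains_a)
                  + (n_b - 1) * variance(gains_b)) / df_error
    t = (mean(gains_a) - mean(gains_b)) / (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return t * t, (1, df_error)

# Hypothetical gains (seconds per item saved) for two small groups.
computer_gains = [0.10, 0.25, 0.05, 0.20]
paired_gains = [0.15, 0.10, 0.30, 0.20]
f_value, dfs = interaction_f_from_gains(computer_gains, paired_gains)
```

A nonsignificant F in this comparison corresponds to the finding that the two groups did not improve differentially.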

Gains in the computer program

Contrary to our expectations, significant effects favoring the computer group in reading words with the clusters were not found. Therefore, we decided to conduct a post hoc analysis to examine whether the children in the computer program improved at the sublexical level, which was the specific target of the computer program. Log files of the computer program were analyzed to examine whether there was an improvement during training in recognizing the written form of auditory stimuli. It should be noted that, although accuracy measures are reported, owing to the timing of the presentations, the tasks also required rapid responses. As a measure with which to analyze the improvements, the percentages of correct responses for each consonant cluster were calculated for three successive trials (the first three trials at the beginning of the level and the last three trials at the end). Using three successive trials rather than a single trial allowed a more reliable assessment. It must be noted that, owing to the adaptivity of the software, the number of written forms (distractors) varied between two and nine during training, and there were fewer distractors at the start of a level than at the end, which imposed a greater challenge during the final trials.

Due to the adaptivity of the program, the children started with a fairly high level of accuracy (mean percentage accuracy for 44 consonant clusters 88.6%; see Fig. 2). During the first level, at the slowest presentation rate, the total number of trials per child was on average 1,690.4 (SD = 499.5). At the end of the first level, the mean level of accuracy was 96.4%. The Wilcoxon signed-rank test showed that this improvement in the accuracy of associating the auditory stimuli with the orthographic items was significant, Z = −3.62, p < 0.001. One of the participants, a girl in grade 2, made many errors during training and did not attain the performance criterion (target item recognized correctly three times in a row) set for proceeding to the higher levels. She remained on the first level throughout the training period and was not included in the analyses of the second level, in which the presentation rate was faster. During the second level, the total number of trials per child averaged 1,659.3 (SD = 523.0), and the mean percentage accuracy rose from 92.2% to 95.2%. This improvement was statistically significant, Z = −2.43, p < 0.05. The majority of the participants reached the third level, but the number of trained trials was lower than during the first two levels, averaging 880.9 (SD = 470.2), as the training period finished. During the third level, the percentage accuracy increased slightly (from 92.9% to 93.4%), but this change was not significant, Z = −0.62, p > 0.50.
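
The Z values above come from Wilcoxon signed-rank tests on paired accuracy scores (start versus end of a level). As an illustration only, the sketch below computes the normal-approximation Z for such paired data. It assumes the common convention of dropping zero differences and using the smaller rank sum, and it applies no tie or continuity correction, so values from a statistics package may differ slightly. The data in the example are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def wilcoxon_z(first, last):
    """Normal-approximation Z for paired scores (no tie or continuity correction)."""
    diffs = [b - a for a, b in zip(first, last) if b != a]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    expected = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (min(w_plus, w_minus) - expected) / sd

# Hypothetical percentage accuracy at the start and end of a level for six children.
start_acc = [85.0, 90.0, 88.0, 92.0, 80.0, 91.0]
end_acc = [95.0, 96.0, 97.0, 93.0, 94.0, 91.0]
z = wilcoxon_z(start_acc, end_acc)
```

Because the smaller rank sum is used, Z is never positive under this convention; the sign therefore carries no information about the direction of change.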

Fig. 2

Mean percentage accuracy for the consonant clusters in the computer group at the beginning and at the end of three levels with increasing speed. The dashed line shows the reading performance in the tests. The vertical lines depict standard errors of the means

An alternative way to view improvement in the trained items was to analyze whether the reading of the trained consonant cluster parts of words increased in accuracy from one test to the next. An increase in accuracy from the pretest (91.3%) to the posttest (96.5%) was found, and this change was statistically significant, Z = −2.56, p < 0.05 (see Fig. 2). To summarize, during the computer program, the children learned to select more correct target items and speeded up their responses; in addition, accuracy improved in reading the trained items (included in words) aloud during the tests.

Global reading fluency gains

Word reading fluency

The scores for the number of syllables read correctly in 1 min were subjected to analysis of variance with the group (computer and paired reading) as the between-subjects factor and test session (pretest and posttest) as the within-subjects factor. The participants improved their word reading skills (see statistics in Table 3 and descriptive statistics in Fig. 3). The main effect of the group was not significant. The critical test session × group interaction was found to be significant, F(1, 37) = 4.46, p < 0.05, \(\eta _{\text{p}}^2 = 0.11\). The paired reading group showed more rapid improvement than the computer group during the training period in the number of syllables read correctly in 1 min (on average, the paired reading group gained 9.1 syllables and the computer group 3.9 syllables). However, the posttest performances (between 71.7 and 74.8 syllables in 1 min) were clearly below the average level of children with typical reading skills. Wimmer and Mayringer (2002) reported an average word reading rate of 166 to 186 syllables per minute in 9- to 10-year-old German-speaking readers.
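
As a quick consistency check, partial eta squared can be recovered from the reported F value and its degrees of freedom using the standard identity \(\eta _{\text{p}}^2 = F \cdot df_{\text{effect}} / (F \cdot df_{\text{effect}} + df_{\text{error}})\). The short sketch below applies it to the interaction reported above; the function name is our own.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)

# Reported interaction: F(1, 37) = 4.46
eta_p2 = partial_eta_squared(4.46, 1, 37)
print(round(eta_p2, 2))
```

The result rounds to the reported value of 0.11.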

Table 3 Means (and standard deviations) by group in global reading fluency at two assessment points
Fig. 3

Mean reading times by group at the pre- and posttests for words (syllables read correctly in 1 min). The vertical lines depict standard errors of the means

Pseudoword reading fluency

The scores for the number of syllables of pseudowords read correctly in 1 min were subjected to analysis of variance with group (computer and paired reading) as the between-subjects factor and test session (pretest and posttest) as the within-subjects factor. The analysis revealed that the groups did not improve in their pseudoword reading during the training period (see statistics in Table 3 and descriptive statistics in Fig. 4). Neither the main effect of the group nor the test session × group interaction was significant (Table 3).

Pretreatment learner characteristics in relation to reading gains

Additionally, we were interested in examining whether certain pretreatment learner characteristics, the age of the participants or their cognitive–linguistic skills (nonverbal IQ, short-term memory, phoneme awareness, and naming speed of digits), would be associated with the gains in speed of reading words with the consonant clusters and in word reading fluency. In addition, we also checked whether the pretreatment learner characteristics would have associations with the posttest reading speed. The correlation coefficients are reported in Table 4. Only one of the background variables or cognitive skills showed a significant association with the gains in reading words with the consonant clusters. Nonverbal reasoning abilities, measured by the Raven, showed the strongest correlation with the gain in word reading speed, \(r_{\text{s}}(39) = -0.35\), p < 0.05. The improvement was larger if the child had high scores in nonverbal IQ. However, this association was influenced by the performance of only one child and, therefore, should be interpreted with caution. Contrary to our expectations, rapid naming speed was not associated with the gains in the speed of reading words with the consonant clusters. However, rapid naming speed was associated with the gains in word reading fluency, \(r_{\text{s}}(39) = 0.33\), p < 0.05; that is, the children with slow naming skills showed a lower gain in word reading fluency than the children with faster naming skills. All the other correlations were nonsignificant. The age of children and rapid naming skills were associated with reading speed at the posttest. Younger children and children with slow naming skills had slower reading speed at the posttest (Table 4).
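
The coefficients in this section are Spearman rank correlations, that is, Pearson correlations computed on ranks. Because ranks limit the leverage of extreme scores but do not remove it, a single child can still shape a coefficient in a sample of this size. The following minimal sketch shows the computation with average ranks for ties; the data are hypothetical and the function names are our own.

```python
from statistics import mean

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed scores."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical data: digits named in 20 s and gain in syllables per minute.
naming_speed = [30, 35, 28, 40, 33]
fluency_gain = [4.0, 6.5, 3.0, 9.0, 7.0]
rho = spearman_rho(naming_speed, fluency_gain)
```

Recomputing the coefficient with one child left out at a time is a simple way to see how strongly a single observation drives the result.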

Fig. 4

Mean reading times by group at the pre- and posttests for pseudowords (syllables read correctly in 1 min). The vertical lines depict standard errors of the means

Table 4 Spearman correlation coefficients between pretreatment learner characteristics, gains during training and posttest reading speed

Discussion

In the present study, we were interested in evaluating the outcomes of two training programs for poor readers: computer training aimed at increasing the efficiency of access to multiletter, sublexical units (word-initial consonant clusters) and paired reading aimed at encouraging children to spend time in reading activities. The aim in both programs was to improve word recognition skills, especially the speed of word recognition. For the computer group, the advantage was hypothesized to be seen in reading words with the consonant clusters and for the paired reading group in global word reading fluency. The results showed that, in reading words with the consonant clusters, both groups exhibited a similar improvement in accuracy and speed. A post hoc analysis was conducted to examine whether the children in the computer program improved at the sublexical level, which was the specific target of the computer program. It was found that the computer training was associated with better reading of the trained sublexical items. In terms of global reading fluency skills, the children in the paired reading group improved more than the children in the computer group. Neither of the groups improved their pseudoword reading skills.

A central hypothesis of the present study was that computer-assisted practice, by providing a large number of repetitions at the subword level, could lead to a generalization effect on reading accuracy and speed with respect to words containing the trained clusters. In the present study, reading accuracy levels were high (varying between 86.7% and 93.5%). These results are in accordance with earlier findings that have shown high accuracy levels, even for children with dyslexia, in more transparent orthographies (Holopainen et al., 2001; Landerl et al., 1997; Wimmer, 1993; Zoccolotti et al., 1999). We indeed found gains in reading accuracy, but they were moderate and not specific to the computer group. Reading accuracy was not the main interest of the present study; however, it is important to note that there was no trade-off between speed and accuracy.

In terms of effects on the speed of reading words with word-initial consonant clusters, we expected that the specific computerized training program would lead to gains, as the average number of presentations of consonant clusters during computer training varied between 40 and 139 times (M = 96.2, SD = 7.5), and during training, emphasis was placed on the speed of responding. In earlier studies, an improvement in the speed of reading trained words has been observed after five to 15 exposures (repeated reading) to the words (Levy, 2001; Reitsma, 1983), although such rapid orthographic learning may not emerge in the reading development of children with reading deficits (Share, 1999). The analyses revealed a significant improvement in reading speed; however, no statistically significant differences between the groups emerged. That is, the paired reading group and the computer group developed similarly in the reading of words with the consonant clusters.

To summarize, after on average 96 presentations of each consonant cluster, in practicing the associations between the orthographic and phonological units, the computer group showed only moderate speed-related improvements when reading words with the trained sublexical items. The results indicate, therefore, that we were unable to significantly increase the efficiency of access to the multiletter units when these units were included in the words. We decided to conduct a post hoc analysis to examine whether the children in the computer group improved at the trained sublexical level. The analysis was based on the log files of the computer program; in addition, we examined whether the reading of the trained consonant cluster parts of words increased in accuracy along with the tests. Computerized training was indeed associated with gains during the training period in the accuracy of the correspondences perceived between the orthographic and phonological units, despite the increase in the number of orthographic alternatives and the acceleration in speed. In addition, accuracy in reading the consonant clusters aloud improved significantly from the pretest to posttest. However, these improvements did not show large generalization effects to reading of words in which the consonant clusters were included.

In considering the results in terms of the reading speed of words with onset consonant clusters, a few comments need to be made. In the present study, in examining the training-induced gains, we used a list-reading task as a dependent measure. As a result, two kinds of transfer effects emerged in this reading task: (1) from the computer task to reading aloud and (2) from the sublexical level to the word level. In the following, these transfer effects are discussed more thoroughly.

First, the computer group participants had to make a transfer from selecting the relevant orthographic stimuli corresponding to the phonological item to pronouncing or sounding out those items in the reading task. The list-reading task therefore did not have a high degree of congruity with the computer task; however, we were interested in analyzing the outcomes of training in an authentic reading task. The selection procedure in the computer task is comparable with a process of identification in executing cognitive tasks, whereas in reading items aloud, accurate and fluent production is required. Generally, in cognitive tasks, identification can be considered a lower level of mastery of a skill than accurate and fluent production. On the basis of the accuracy results of the trained items at the sublexical level (the post hoc analysis), this transfer to sounding out was not a particular problem, as an improvement was found in both tasks (selection and pronunciation). In addition, in another study, Hintikka, Landerl, Aro, and Lyytinen (2008) contrasted practice in reading aloud with the training of the associations between the phonological and orthographic units. It was found that these two training types were equally effective in terms of reading outcomes; thus, association practice can produce positive effects on reading aloud, that is, on a production task.

Second, in the present study, there was a requirement to generalize from sublexical consonant clusters to word reading. Earlier word-naming studies showed that the outcomes of the naming practice were item-specific (Berends & Reitsma, 2006; Lemoine et al., 1993; Lovett et al., 1990; Thaler et al., 2004). However, such studies have usually included orthographic neighbor words as control words, and studies investigating generalization from the sublexical level to the word level are lacking. In the previously mentioned study by Hintikka et al. (2008), a generalization effect from the sublexical level to word reading was found. That study differed from the present one in two respects: the training program was designed to emphasize fast responding more clearly, and the experimental reading task was more congruent with the training program. Martin-Chang, Levy, and O’Neil (2007) note that transfer increases as the congruency between the training and transfer tasks increases, that is, when the same cognitive processes are engaged during training and transfer tasks. In the light of the transfer-appropriate processing hypothesis (Martin-Chang et al., 2007), it might be that the effectiveness of the computer program was subjected to a stringent test, as two transfer steps occurred simultaneously in examining the outcomes of the program. It is easier to get positive outcomes when the experimental task is congruent with the training program. On the other hand, training is of little relevance when the trained skills do not extend to everyday reading. The findings of the present study can be discussed by drawing a distinction between proximal and distal effects. With respect to proximal outcomes, that is, the effects in the computer task itself, an improvement was found. However, for the distal outcomes of the computer training, measured with a list-reading task that did not resemble the training settings, the improvement was small. On the basis of the present findings, the computer task was not particularly effective in producing distal effects. Further studies are needed to examine the transfer issues and the role of accelerated presentation speed in training word recognition fluency (see also Berends & Reitsma, 2006).

We also looked at the development of general reading fluency skills, as on the basis of previous studies on tutoring programs (Baker et al., 2000; Kuhn & Stahl, 2003; O’Connor et al., 2007; Shany & Biemiller, 1995), paired reading practice could be associated with an improvement in global reading fluency tasks. In the present study, it was found that, after book reading practice for approximately 6 weeks (25 sessions), the participants in the paired reading group improved their reading speed of words more rapidly than the computer group. Thus, the results confirm the previous findings of the positive outcomes of tutoring programs on reading fluency. Furthermore, Baker et al. (2000) and O’Connor et al. (2007) used tutored reading practice with limited tutor training and found gains in reading fluency and comprehension. This positive finding of tutored reading practice applies also for the children with reading deficits in a language with a regular orthography whose main problem is slow reading speed. The positive effects were found after a relatively short training period; however, it is clear that, in 6 weeks, the reading deficits cannot be remediated. The posttest performances of the participants were still clearly below the average level (see Wimmer & Mayringer, 2002).

Recent conceptions about the growth in children’s word recognition suggest that, after the initial phases of reading acquisition, the learning effects in reading are mainly based on the accumulation of knowledge about individual words or word representations (e.g., Berends & Reitsma, 2006; Perfetti, 1992; Share, 1999) and that the specific problem of dyslexics is not so much in gaining orthographic access to whole words as it is in computing sublexical phonology (Van der Leij & van Daal, 1999; Ziegler & Goswami, 2005). Overall, the results of the present small-scale intervention study lend additional support to these findings, as exposure to print did not enhance a rapid phonological recoding strategy, the reading of pseudowords, but helped pupils to recognize words more rapidly. This type of improvement in word recognition efficiency is an essential prerequisite for the primary purpose of reading, the construction of meaning from the text and, thus, an important finding from an instructional point of view.

Additionally, we were interested in whether the age of the participants or certain cognitive–linguistic skills (nonverbal IQ, short-term memory, phoneme awareness, and naming speed of digits) were associated with the gains in reading speed, as in the research literature, some of these skills have predicted response to training (Lovett & Steinbach, 1997; Torgesen & Davis, 1996; Torgesen et al., 1999). In terms of reading fluency, a strong association has been found with rapid naming skill (e.g., Holopainen et al., 2001; Wolf & Bowers, 1999), and it has been shown that, at least in the early stages of reading development, poor readers who are slow namers show slower development in learning new words compared with poor readers who are fast namers (Levy et al., 1999). In the present study, the nonverbal reasoning abilities had a statistically significant association with the gain in reading words with the consonant clusters. However, this relationship was strongly influenced by only one student. Rapid naming speed was not associated with the gains in the reading speed of words with the consonant clusters during the intervention period, which might be owing to the specific nature of reading words with the consonant clusters and to the short intervention period. However, rapid naming speed was associated with the gains in global word reading fluency; that is, the children with slow naming skills showed a lower gain in word reading fluency than the children with faster naming skills, which is consistent with the findings of Bowers (1993), Levy et al. (1999), and Holopainen et al. (2001). No other pretreatment characteristic was related to the gains in word reading fluency. It is important to note that, although age was associated with the reading speed at the posttest (older children being faster readers than younger participants), age did not show significant correlations with the gain in reading speed. In addition, the lack of significant interaction effects between test session and grade and between test session, group, and grade indicates that the amount of improvement was similar across the grades. This result is consistent with the findings of O’Connor et al. (2007).

The main limitation of the present study was that we were not able to include a nontraining control group in the study. We decided to contrast the outcomes of two different kinds of training programs that could act as comparison groups to each other. We noted in the introduction that the design shows its limitations if the two methods induce equal outcomes. This was found in reading words with the word-initial consonant clusters. As a control group receiving only regular school instruction is lacking, one might ask whether the improvement was a retest effect. It should be noted that the training period was short (6 weeks), and we think that the reading speed of poor readers in grades 2 and 4 improves relatively little as a result of regular school instruction during such a short period. Research on reading speed improvement among German-speaking poor readers is scarce, which makes it difficult to estimate the amount of improvement during 6 weeks of regular school instruction. However, estimates can be obtained from a couple of training studies that have included control groups that did not receive specific literacy training. In the study by Hintikka et al. (2008), children of a control group participated in the tests within a 2-week time interval. Their reading speed remained at a similar level. A study by Schulte-Körne, Deimel, Hülsmann, Seidler, and Remschmidt (2001) included children from grades 2 to 4, and their reading speed development was measured within a time interval of 3 months. Assuming that reading growth proceeds linearly, the control group children improved in word reading speed by, on average, 0.02 s per word in 6 weeks. The computer and paired reading groups of the present study improved in 6 weeks in the reading speed of words with consonant clusters by 0.1 to 0.2 s, which is five to ten times larger growth than for the control group in the study by Schulte-Körne et al. (2001). In addition, similar growth figures can be obtained from cross-sectional data of the standardized reading test (Landerl et al., 2006) and from a longitudinal study (Klicpera & Schabmann, 1993) containing different types of words and passages. On the basis of these two data sets, the reading speed gain of poor readers in 6 weeks is approximately 0.03 to 0.07 s per word. Again, the groups of the present study showed 1.5 to 6.5 times larger growth than what can be expected on the basis of the estimates for reading speed improvement among poor readers receiving regular school instruction. Another limitation in the present study is that the parallel versions of the reading tests were not counterbalanced across subjects. It is not possible to rule out that the versions used in the pre- and posttests differed in difficulty, in which case the growth in global reading fluency would be more apparent than real. However, for global reading fluency, the most critical result was that the paired reading group improved more than the computer group, for which the same test versions were administered, so we think that the difference between the groups is a real result.

From an instructional point of view, both types of training were designed to be easy to implement and not labor intensive for schools and teachers. The requirements in the paired reading program were kept minimal: Volunteers were used as tutors, and their training was short. However, as adult tutors work on a one-to-one basis with children, the recruitment of the tutors and management of the timetables for the adult tutors and pupils require some time and effort. Classwide peer tutoring as a means of efficient delivery could be helpful, as it is less resource intensive than one-to-one adult tutoring (see Kuhn & Stahl, 2003). Computer training is easy to administer and offers a cost-effective way to implement training in schools. The program used in the present study enables the child to practice independently, and the continuous presence of an adult tutor is not required. An additional benefit of computer-mediated training might be that a play-like format is motivating compared to traditional reading aloud. On the other hand, social interaction has recently been shown to play a critical role in the language learning of infants (for a review, see Kuhl, 2004). The importance of social factors in language learning might be regulated through active participation, attention, and motivation (Kuhl, 2004). The role of social and motivational factors in reading training programs requires more explicit clarification.