Introduction

Developmental dyslexia (hereafter dyslexia) is a persistent reading disorder characterized by inaccurate (or slow and effortful) decoding and reading, as well as poor spelling skills (Lyon et al., 2003). A number of studies have clarified that it is not caused by any of the following conditions: intellectual development disorders, sensory impairment (vision or hearing), neurological or motor disorders, lack of access to education, lack of proficiency in the language of academic instruction, and psychosocial adversity (American Psychiatric Association, 2013). Rather, recent influential theories propose a multifactorial model of the disorder (Perry et al., 2019), in which an underlying phonological processing deficit plays a predominant role (Norton et al., 2014).

Developmental dyslexia is known to be a lifelong impairment in which a number of childhood symptoms can persist into adulthood, especially poor reading fluency (Breznitz, 2012; Cavalli et al., 2018; Shaywitz & Shaywitz, 2005; Martin et al., 2010), regardless of language transparency (Landerl & Wimmer, 2008; Paizi et al., 2010). This specific deficiency, formally known as a lack of automatization in written word recognition, is disabling because it greatly hinders reading comprehension and its subprocesses (Norton & Wolf, 2012; Samuels, 1979). Furthermore, when these reading disabilities persist, they have been shown to reduce an individual’s socio-emotional well-being (Livingston et al., 2018; Mammarella et al., 2016; Willcutt & Pennington, 2000).

Targeted therapies for reading disabilities are typically based on diagnostic approaches that evaluate word identification processes (ortho-phonological conversion, i.e., access to the spelling lexicon) in order to define subtypes of dyslexia (mixed, phonological, and surface) and the underlying skills necessary (phonological, visual or visuo-attentional) for the proper functioning of said processes. Remedial interventions therefore aim to tap into these component processes, or isolate several, in order to identify and treat the underlying deficits. Much work still remains however, to develop specialized intervention methods that are effective for a given risk, or disability profile, such as children at risk for developing dyslexia; children already with a phonological and orthographic coding disorder in the early years of learning to read; or children, adolescents, and adults with persistent reading speed deficits.

The foundational studies on children at risk of developing reading disabilities (Hatcher et al., 1994; Hatcher et al., 2004; Vellutino et al., 1996; Torgesen et al., 1992; Torgesen, 1997) have been instrumental in demonstrating the preponderant role of phonological disorders as a causal factor in delayed sublexical pathway development (Coltheart et al., 1993; Coltheart et al. 2001). More specifically, these studies have shown that the phonological coding stage is a prerequisite for learning to read (Hutzler et al., 2004) provided that phonological representations are correctly defined (Goswami & Bryant 1990). This step of coupling spelling and phonetics allows the orthographic lexicon to be fed by a phonological recoding process (Share’s self-learning theory; Share, 1995), and the combination of these two procedures constitutes a powerful bootstrapping mechanism (Ziegler et al. 2014) for learning to read.

However, successful acquisition of phonological encoding skills alone does not guarantee successful development of the orthographic lexicon, especially in languages with opaque spelling (Ziegler & Goswami, 2005). In these languages, children must learn to decode larger units (e.g., rhymes, syllables, words) in order to automate the identification of allographs (i.e., different standard graphic representations of the same speech sound, e.g., in French: ai, ei, er, es, et... for /ɛ/) or contextual spelling irregularities (e.g., in French: ci /si/ vs. co /ko/). In turn, the development of an orthographic lexicon, or orthographic memory, in accordance with the statistical learning of the graphotactic regularities of one’s language (Campbell & Coltheart, 1984; Pacton et al., 2013), makes it possible to increase decoding speed through orthographic recoding. The semantic coding stage, which is associated with the phonological and orthographic coding stages, then allows the transition from word recognition to word meaning. This is in keeping with the classic formula of Hoover and Gough (1990), according to which reading results from the combination of word decoding and linguistic comprehension.

Computational models of reading aloud have classically accounted for these different stages of written word identification either by dissociating phonological-sublexical coding from orthographic-lexical coding (Coltheart et al., 2001; Perry et al., 2007, 2014) or by positing simultaneous activation of orthographic, phonological, and semantic coding (Seidenberg & McClelland, 1989; Plaut et al., 1996; Ans et al., 1998; Harm & Seidenberg, 1999). The results of all of these models confirm that a successful transition to expert reading depends on the functional integrity of three types of coding: phonological, orthographic, and semantic. Nonetheless, interventions for written word identification disorders are generally based on dual-route models (e.g., Coltheart et al., 2001) with the aim of treating specific deficits in either the grapho-phonological or orthographic conversion processes. However, most of the original studies that have described school-based interventions for children at risk of reading difficulties (e.g., Torgesen et al., 1992; Hatcher et al., 1994, 2004; Vellutino et al., 1996; Hindson et al., 2005) are not limited to exercises focusing on phonological awareness and grapho-phonological conversion, but alternate several different exercises that tap into these three coding skills, which is more consistent with the connectionist view (Seidenberg & McClelland, 1989; Plaut et al., 1996; Harm & Seidenberg, 1999). Likewise, interventions which focus on writing and speech coupling in children with dyslexia, or at risk of a learning disability (Ecalle et al., 2009; Fraga González et al., 2015; Mehringer et al., 2020; Saine et al., 2011), also make use of intermodal procedures that combine phonological, visual, and semantic skills.
In this respect, and as suggested by the connectionist approach, it would seem advantageous to also consider interventions based on their capacity to account for the parallelism of phonological, orthographic, and semantic coding, and improve their balancing (e.g., address under-/over-utilization).

While these previous interventions have been shown to improve decoding accuracy and performance on phonological awareness tasks in children with dyslexia, these positive effects transfer only very weakly to improved reading speed (Eden et al., 2004; Torgesen et al., 2001). In general, interventions may be classified into two types: ones with adaptive objectives (i.e., teaching reading strategies, such as using sentence context) or curative objectives (i.e., directly treating underlying reading deficits, such as poor phonological and visual-attention skills). Currently, many research-based interventions have more adaptive than curative objectives, and their long-term effectiveness is often debated (Gabrieli, 2009). But some authors, such as Vellutino et al. (1996), also go so far as to question the relevance of the curative interventions that target phonological awareness, as they would benefit children at risk for dyslexia more than the actual dyslexic children themselves; despite the intensive nature of these interventions.

Nonetheless, it is to be expected that said interventions on coding and phonological awareness have demonstrated positive effects on accuracy rather than reading speed. Indeed, these trainings target sequential rather than procedural processes. Thus, in the typical child, if the development of decoding accuracy occurs simultaneously with that of reading speed, and in accordance with the phonological recoding mechanism (or self-learning, Share, 1995), fluency in reading is only really acquired during the first 2 years of learning in primary classes. Reading fluency is therefore dependent on the level of exposure to reading, but also on the child’s motivation to read (Castles et al., 2018) and the opacity of the written language (Ziegler & Goswami 2005).

One of the hypotheses that could explain persistent reading speed deficits in children is a developmental imbalance between phonological and orthographic decoding procedures. This imbalance would impede a reciprocal “feeding” of these two identification procedures (Breznitz, 1997), leading to an over-reliance on semantic coding (see Cavalli et al., 2017, for evidence of compensation via the semantic pathway in adults with dyslexia) and/or phonological coding. This conception of a functional imbalance between phonological and orthographic coding in turn led clinicians and researchers to propose interventions to reinforce the orthographic lexicon.

Some of the first developed clinical and pedagogical approaches in response to this were popularized by so-called repeated reading training (Vellutino et al., 1996; Tan & Nicholson, 1997). Initial descriptions of these exercises emphasized the need for systematic reinforcement through quickly reading the meaning of the repeated words or sentences, practicing therefore semantic coding as well. The effectiveness of this type of intervention has been debated (Meyer & Felton, 1999; Therrien, 2004) depending on whether the training involves words, sentences, or longer texts; whether the reading is silent or aloud; with or without control of reading errors; and at which age the intervention is proposed (Wexler et al., 2008). Most authors seem to agree that this type of intervention requires a prerequisite level of grapho-phonological decoding skills. Moreover, the gains noted are not very generalizable (Strickland et al., 2013). Other studies have highlighted the impact of repeated reading compared to phonological training (Lovett et al., 2000), when it is alternated with phonological awareness training (McArthur et al., 2015), or when it is associated with auditory masking and an accelerated reading condition, according to the child’s level of decoding speed (Breznitz, 1997, 2012).

The question of which remedial intervention for reading speed should be preferred, at least initially, can be resolved systematically, at least with regard to developmental delays in grapho-phonological processing. Such an intervention would first directly target said processing, followed by orthographic processing. A consensus has been reached on the causal relationship between developmental delays in grapho-phonological conversion and phonological deficits, as noted in a number of studies of children (Menghini et al., 2010; Saksida et al., 2016; White et al., 2006) and of adults with dyslexia (Bruck, 1992; Martin et al., 2010; Ramus et al., 2003). However, some of these studies also reveal a smaller proportion of children or adults with dyslexia who present a single visuo-attentional deficit or a double phonological and visuo-attentional deficit.

In contrast, other studies (Bosse et al., 2007; Zoubrinetzky et al., 2014) using a test of visuo-attentional span (Evadys: Valdois et al., 2014) report an almost identical distribution of children with single (phonological or visuo-attentional) and mixed deficit profiles. In this respect, several studies (Bosse et al., 2007; Ziegler et al., 2010) bring to light a specific profile of phonological dyslexia in which a dysfunction or non-automation of grapho-phonological coupling can be associated with deficient performance in visuo-attentional letter-perception tasks. Based on this hypothesis, various studies have proposed training sessions dealing with visuo-attentional deficits (e.g., Franceschini et al., 2013; Lorusso et al., 2005) or visuo-attentional span deficits (e.g., Zoubrinetzky et al., 2019), and notably demonstrated gains in both accuracy and speed when reading not only irregular words, but also pseudowords and texts.

Recent literature has provided convincing evidence that the impaired sensory or cognitive processes leading to reading disability are of a multifactorial nature (Pennington, 2006; Ramus & Ahissar, 2012; Ziegler et al., 2019). Numerous meta-analyses and literature reviews have also confirmed the beneficial impact of regular training in phonological awareness and coding on reading development (Ehri et al., 2001; Galuschka et al., 2014; Melby-Lervåg et al., 2012; Serniclaes et al., 2015; Suggate, 2016), and some studies underline the value of taking into account the impact of specific phonological and visuo-attentional deficits (Ziegler et al., 2019; Zoubrinetzky et al., 2019). It therefore naturally follows that an effective, methodical approach to remediating dyslexia must target the underlying deficits and both phonological and orthographic coding procedures according to their degree of impairment, as proposed in dual-route models of reading. However, as noted above, previous remedial approaches have mainly improved reading accuracy, with very little effect on reading speed (Shaywitz & Shaywitz, 2005). The objective of the current study is to assess the impact of a remedial intervention that may resolve this imbalance and hence also improve reading efficiency, in light of the parallel processing of multiple codings during written word identification, as in the connectionist view previously discussed.

The proposed remedial intervention that we test may be considered a multimodal approach: rather than seeking to address a specific underlying deficit (phonological and/or visuo-attentional) associated with a dysfunction in an identification procedure (phonological and/or orthographic), it aims to balance the activity of these procedures. To this end, we have taken up, in part, the experimental design of Breznitz (1997; for a review, see Breznitz, 2012), involving intensive training in repeated reading with a vocal music mask. The hypothesis for this approach’s efficacy was based on the results of two previous studies by Salamé and Baddeley (1987, 1989), which demonstrated that inattentive listening to language or vocal music would disturb the operation of the phonological loop storage unit. This disturbance should therefore lead to less use of the grapho-phonological conversion procedure for reading and thus stimulate the use of a spelling procedure. The spelling route would moreover be facilitated by the context of repeated reading, in which word scrolling speed is adapted to each dyslexic child being trained.

Therefore, in this work, we implement a remedial technique for dyslexia that is principally inspired by the repeated reading with vocal musical masking (RVM) condition of Breznitz (1997). We performed two studies to evaluate the efficacy of the proposed method, each with independent participant groups: (1) a pilot clinical study comparing a control group and a treatment group of children with dyslexia to evaluate the immediate effects and (2) a longitudinal study to examine more closely the children’s progress over a period of 13 months. More specifically, the first study compares reading gains between groups of children with dyslexia who underwent training with auditory masking (RVM) vs. without auditory masking. The second study compares reading gains of children with dyslexia who were followed in a typical clinical university-hospital setting. This study evaluates whether and how children may progress after RVM training, even if they have already completed 8 months of standard remediation program (SRP) training. This second study is part of a quasi-experimental clinical approach to compare the effectiveness of reading training in accordance with the recommendations of Evidence-Based Practice research.

Notably, in the longitudinal study, the following objectives were defined: (1) to evaluate the impact of an intervention program that specifically targets reading speed, as compared to a standard remediation program (SRP); (2) to determine the number of participants who individually benefited from this training, taking into account their individual reading profiles and underlying deficits; (3) to identify the factors that predict gains in reading efficiency; and (4) to rate the positive attitude (or lack thereof) of each child about reading and writing before and after each type of training.

Methods

Clinical pilot study

Prior to the longitudinal study, a pre-post-test clinical pilot study was conducted on an independent group of dyslexic children (diagnosed by a reference center for learning disabilities; CERTA; Paris Hospital University) in order to evaluate the validity of the proposed RVM intervention program. The selection criteria for the participants were the same as those applied in the longitudinal study (see the section on participants below), as well as the implementation approach of the intervention program, which consisted of a 5-week training period in which reading efficiency levels were measured before and after.

Specifically, a total of 66 participants with dyslexia were randomly assigned to two groups (control and treatment). The control group consisted of n = 29 children (12 girls and 17 boys; mean age = 120.6 months; sd = 6.7) who followed the repeated reading intervention program without vocal masking, and the treatment group consisted of n = 37 children (15 girls and 22 boys; mean age = 118.9 months; sd = 7) who followed the repeated reading with vocal musical masking (RVM) intervention program.

Reading levels (measured before and after) were assessed by the reference standard in France, the Alouette leximetric test (Lefavrais, 1967; Lefavrais, 2005). The materials used in this pilot study (e.g., for the leximetric task and the implementation of the training paradigm) match those used in the longitudinal study. The remaining subsections of this “Methods” section detail the implementation of the longitudinal study; the “Results” section then reports the results of both the clinical pilot and longitudinal studies.

Longitudinal study

Participants

The 54 children (25 girls and 29 boys, between 9 and 12 years old) that participated in this study were previously diagnosed with dyslexia and received longitudinal follow-ups through the care of a university hospital unit (CERTA, i.e., Reference Centre for Learning Disabilities). As for the inclusion criteria of the study, the children with dyslexia had to show (1) a reading speed 18 months slower than typical readers of the same chronological age (Monzalvo et al., 2012; Sprenger-Charolles, 2019) on a leximetric test and (2) a non-pathological psychometric efficiency on the Wechsler Intelligence Scale for Children Fifth Edition (hereafter WISC-V, Wechsler, 2014). In this study, the full WISC-V has thus been administered to each participant before inclusion. Composite scores including Verbal Comprehension Index (VCI), Working Memory Index (WMI), Processing Speed Index (PSI), Fluid Reasoning Index (FRI), and Visual Spatial Index (VSI) are provided in Table 1. Moreover, an evaluation of reading and reading-related skills was carried out before and after the training (see Table 2).

Table 1 Results of the leximetric (“Alouette” reading test, Lefavrais, 1967) and psychometric (VCI, Verbal Comprehension Index; WMI Working Memory Index; PSI, Processing Speed Index; FRI, Fluid Reasoning Index; VSI, Visual Spatial Index) efficiency tests for children with dyslexia (N = 54)
Table 2 Test results pre-T1 and post-T2 phases

These children were all attending school normally and had been receiving speech and language therapy since their first school grade (mean 40 months (14) of therapy, as one 30-min session per week). Children were excluded from the study when (1) agreement to coordinate the training was not obtained from the speech therapist regularly following the child, (2) their daily training was suspended or irregular, or (3) consent to participate in the study was withdrawn during or after the data collection. The present study was conducted in accordance with the Declaration of Helsinki. It was conducted with the understanding and the written consent of each child’s parent and in accordance with the ethical guidelines between the academic organization (Université Côte D’Azur) and educational organizations. Moreover, a declaration has been made to the French Data Protection Authority (CNIL, number 2163965v0) regarding the protection of the collected data and the participants’ anonymity.

Experimental design

The present study was conducted over a duration of 13 months, divided into 3 phases (see Fig. 1).

Fig. 1 The three phases of the experiment and the time allotted in months: RVM (repeated reading with vocal music masking), SRP (standard remediation program)

The first phase (T0-T1) lasted 8 months and consisted of the child receiving a standard remediation program (SRP) once a week (30 min per session = approximately 14 h) from a speech and language therapist in private practice. The SRP intervention includes an alternation of reading, spelling, and phonological awareness exercises to stimulate grapho-phonological conversion processes and orthographic memory.

The second phase (T1-T2) lasted 2 months and consisted of the RVM remedial intervention. For each child, the intervention took place over 5 consecutive weeks, 6 days a week (15 min a day = 7 h 30). The RVM remedial intervention consisted of the child repeatedly reading aloud two texts while listening to a song in French, with headphones specially created for this study. The therapy was carried out under the supervision of the child’s speech therapist or one of his/her parents. On average, participants completed 54 (SD = 6) training sessions in the 5 weeks.

The third phase (T2-T3) lasted 3 months and consisted of a return to the standard remediation program (SRP), without any RVM intervention.

Regarding the invitation to participate in the study, following a consultation with the hospital unit, a waiver and a description of the study (to participate in the RVM intervention) were presented to each recommended child and his/her parents. Once consent had been received from both child and parents, we contacted the speech therapist following the child. The therapist was provided with a website link that gave instructions and access to the RVM remedial intervention materials: texts for repeated reading, the songs for musical masking, and a function for recording the reading times of each text. After a briefing with the university hospital unit, the therapist saw the child again during period T1-T2 at his or her usual session time and proposed the first two texts that the child should read during the first week, and so on until the end of the program (a total of 10 texts over 5 weeks). For each new text (2 per week), comprehension was verified by a questionnaire (5 questions). If comprehension of either the text or any words thereof was flawed, the speech therapist clarified them to the child. After the reading, any of the child’s decoding errors were then noted by the therapist.

Materials

Evaluation: inventories, tests, and Likert scales

Reading level

The primary variable of interest measured in both the clinical pilot and longitudinal studies was reading level. Reading level was evaluated with the leximetric test “l’Alouette” (Lefavrais, 1967; Lefavrais, 2005), which is considered in France to be the “gold standard” instrument for assessing both children (Bertrand et al., 2010; Sprenger-Charolles, 2019) and adults (Cavalli et al., 2018). The Alouette test is systematically used by French practitioners and researchers to screen for dyslexia, as well as to assess reading level in general, from childhood to adulthood. The psychometric qualities of this test have been demonstrated in a number of previous studies in both children (Bertrand et al., 2010; Sprenger-Charolles, 2019) and adults (e.g., Cavalli et al., 2018), and the test has notably been found to have high convergent validity (see Bertrand et al., 2010; Cavalli et al., 2018). In the Alouette test, the child is allotted 3 min to read a 265-word text passage aloud as quickly and accurately as possible. The text consists of real words in meaningless but grammatically and syntactically correct sentences, in order to limit the dyslexic reader’s access to contextual information (Rack et al., 1992; Nation & Snowling, 1998). Furthermore, the text is composed of five sections and is accompanied by drawings that promote contextual errors (e.g., a drawing of a squirrel [écureuil] close to the word écueil [pitfall]). The text includes rare words and some spelling traps: items with silent letters (temps /tã/, nids /ni/), contextual graphemes (gai /ɡɛ/, geai /ʒɛ/), and items that are phonologically similar (Annie /a.ni/, amie /a.mi/). The test also tracks contextual anticipation, which is characteristic of the youngest and least skilled readers (Perfetti et al., 1979; Stanovich, 1984). The text contains fixed expressions that are modified (“au clair de lune” instead of the usual “au clair de la lune”). It also contains words that are similar to those suggested by the context (e.g., poison [poison] rather than poisson [fish] after lac [lake]). The test thus prevents dyslexic readers and poor readers from compensating for their written word recognition difficulties by using contextual information (Rack et al., 1992). At the end of the test, an index of reading efficiency (CTL) is calculated by taking into account both time and accuracy, through the following equation: CTL = (C / TL) × 180, where C is the number of words read correctly, TL is the child’s reading time in seconds, and 180 s is the maximum reading time.
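The reading efficiency index described above can be sketched in a few lines of Python. This is only an illustration of the stated formula, not the scoring procedure from the test manual: the function name `ctl_score`, the input validation, and the two example children are hypothetical.

```python
def ctl_score(words_correct: int, reading_time_s: float, max_time_s: float = 180.0) -> float:
    """Reading efficiency index: CTL = (C / TL) * 180.

    words_correct  -- C, number of words read correctly
    reading_time_s -- TL, the child's reading time in seconds
    max_time_s     -- maximum reading time allowed by the test (180 s)
    """
    if not 0 < reading_time_s <= max_time_s:
        raise ValueError("reading time must be in (0, max_time_s]")
    # Multiply before dividing so that whole-number inputs give exact results.
    return words_correct * max_time_s / reading_time_s


# Hypothetical child A: uses the full 3 min and reads 120 words correctly.
print(ctl_score(120, 180))  # 120.0

# Hypothetical child B: finishes the 265-word text in 150 s with 5 errors.
print(ctl_score(260, 150))  # 312.0
```

Under this formula, a child who uses the full 180 s obtains a score equal to the number of words read correctly, while a child who finishes early is credited for the time saved.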

Reading fluency

Reading fluency (reading text with meaning) was assessed with an excerpt from Oscar Wilde’s short story “Le Géant Égoïste” [The Selfish Giant]. The text was homogenized in lexical frequency according to the Manulex database (Lété et al., 2004) and the psycholinguistic characteristics were matched with the training texts used for the RVM remedial intervention, i.e., total number of words (n = 350 ± 3), regular word digrams (n = 148 ± 10), trigrams (n = 6 ± 4), irregular words (n = 3 ± 3), and dialogs (0 to 3 sentences maximum).

Reading accuracy and reading-related skills

Reading accuracy and reading-related skills were evaluated with the Evalec-Primary computerized inventory (Sprenger-Charolles et al., 2010). The psychometric qualities of this battery, and particularly its specificity in evaluating word identification and metaphonological deficits, have been demonstrated in previous research (e.g., Sprenger-Charolles et al., 2005). For the reading portion of this inventory, which contains 2 subtests, all words are matched in length (number of letters), number of phonemes and syllables, and lexical frequency. The first subtest (48 items) consists of a list of irregular words and 3 regular word lists of 12 words each. The first list of regular words contains only simple graphemes (a letter corresponds to a phoneme), the second list contains words with a frequent digraph in French (ch, ou, on, etc.), and the last list contains only words with contextual graphemes (e.g., “ce” /sə/ vs. “ca” /ka/). The second reading subtest (36 items) contains pseudowords matched to the regular words of the first subtest (12 simple pseudowords, 12 with digraphs, 12 with contextual graphemes). The special feature of this computerized test is that it uses voice detection to measure recognition latency for correctly read words only, i.e., the time between the moment the word appears on the screen and the onset of the child’s spoken response.

The phonology portion of the Evalec-Primary inventory (i.e., metaphonology) is composed of 4 subtests. In this organization, it classically assesses syllabic and phonemic awareness skills with a syllabic segmentation subtest (deleting the first syllable of 10 Consonant-Vowel-Consonant, i.e., CVC, pseudowords: e.g., “Povidu”) and two phonemic segmentation subtests (deleting the first sound of 12 CVC monosyllabic pseudowords: e.g., “zak” or 12 CCV monosyllabic pseudowords: e.g., “pluf”). It is then completed with a phonological short-term memory subtest (repetition of 24 pseudowords of 3 to 6 syllables; 6 items per category: e.g., “sogute,” “munigamessotir”).

Visuo-attentional skills

Visuo-attentional span skills were assessed with the Evadys computerized inventory (Valdois et al., 2014). The first subtest, known as “Global Report,” consists of trials in which the child reports a sequence of 5 letters, randomly drawn from 10 consonants (B, P, T, F, L, M, D, S, R, H), immediately after the sequence disappears from the screen (200 ms presentation time). The second subtest, known as “Partial Report,” consists of trials in which a vertical bar appears along with the sequence of 5 letters, indicating the position of the single letter to be named. A percentage of correctly identified letters is then calculated for Global Report (100 letters presented over 20 presentations) and for Partial Report (50 cued letters over 50 presentations). The overall measure from this test is a composite span score, corresponding to the average success rate in the Global and Partial Report conditions. Finally, note that these phonological and visuo-attentional skills were measured in phases T1 and T2 of the experiment.
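As a minimal sketch of the composite span score described above, assuming it is the unweighted mean of the two percentage scores, the calculation could look as follows. The function names and the example counts are hypothetical and are not taken from the Evadys software.

```python
def percent_correct(n_correct: int, n_presented: int) -> float:
    """Percentage of letters correctly reported in one condition."""
    if not 0 <= n_correct <= n_presented:
        raise ValueError("n_correct must lie between 0 and n_presented")
    return 100.0 * n_correct / n_presented


def composite_span(global_correct: int, partial_correct: int,
                   global_total: int = 100, partial_total: int = 50) -> float:
    """Average success rate over the Global and Partial Report conditions.

    Defaults follow the text: 100 letters in Global Report (20 trials x 5 letters)
    and 50 cued letters in Partial Report (50 trials x 1 letter).
    """
    global_pct = percent_correct(global_correct, global_total)
    partial_pct = percent_correct(partial_correct, partial_total)
    return (global_pct + partial_pct) / 2


# Hypothetical child: 70 of 100 letters in Global Report, 40 of 50 in Partial Report.
print(composite_span(70, 40))  # 75.0
```

Averaging the two percentages, rather than pooling the raw counts, gives both conditions equal weight even though Global Report presents twice as many letters.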

Phonemic fluency

Both oral and written tests of phonemic fluency were used to assess executive functioning skills and access to the orthographic lexicon (Frith et al., 1994; Booth et al., 2010; Varvara et al., 2014). In phonemic fluency tasks, the child produces as many words as possible beginning with the sounds “P” and “M,” within a time limit of 1 min in the oral modality and 2 min in the written modality.

Attitude about reading and writing

A Likert Scale of 10 questions was given to the child before, as well as after the RVM remedial intervention, in order to evaluate the child’s attitude about reading and writing (level of positivity or negativity).

Training texts for the RVM remedial intervention

In compliance with the official database and reading level recommendations of the French Ministry of Education, 10 training texts were composed, and homogenized in lexical frequency according to the Manulex database (Lété et al., 2004). With regard to homogenizing the texts linguistically, the following amounts were controlled: the total number of words (n = 350 ± 3), regular word digrams (n = 148 ± 10), trigrams (n = 6 ± 4), irregular words (n = 3 ± 3), and dialogs (0 to 3 sentences maximum). The font used in the texts was Calibri size 12 with a line spacing of 1.5. Each text was accompanied with a five-item multiple choice questionnaire to assess reading comprehension.

Online platform and auditory masking

The auditory masking program was made accessible on an online platform that enabled the therapist to automatically play and loop the song (auditory mask) while the text was being read, facilitating the standardization of the RVM therapy. Two songs from popular French music were chosen and played in alternation as masks.

Results

Clinical pilot study results

A repeated-measures ANOVA was conducted on reading efficiency (Alouette scores) with time of evaluation (pre-test and post-test) as a within-subject factor and treatment group (dyslexic RVM and dyslexic control SRP) as a between-subject factor (see Fig. 2). The results yielded a main effect of time (F(1;64) = 40.1; p < .001; η2 = 0.38); the effect of group was non-significant (F(1;64) = 0.1; p = .97), but the group-by-time interaction was significant (F(1;64) = 6.2; p < .01). The post-hoc analyses indicated that while there was no significant difference between the two groups at either pre-test or post-test (respectively, t(64) = 0.9; p = .79 and t(64) = 1.2; p = .45), the difference between pre-test and post-test reading scores was significant within the dyslexic RVM group (t(36) = 6.6; p < .01) and non-significant within the dyslexic control group (t(28) = 1.8; p = .10).
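As a reading aid for the effect size of the main effect of time, the sketch below recovers partial eta squared from an F statistic and its degrees of freedom, using the standard identity η²p = (F × df_effect) / (F × df_effect + df_error). It assumes that the reported η2 is a partial eta squared, which the text does not state; the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared implied by an F test: F*df1 / (F*df1 + df2)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of time in the pilot study: F(1;64) = 40.1
print(round(partial_eta_squared(40.1, 1, 64), 3))  # 0.385
```

The value of about 0.385 is in line with the reported η2 = 0.38, given that the F statistic itself is rounded to one decimal place.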

Fig. 2

Average reading scores (and standard deviation) as a function of group (dyslexic RVM and dyslexic control SRP)

Longitudinal study results

A repeated-measures ANOVA was conducted on reading efficiency (Alouette scores) with time of evaluation (T0, T1, T2, T3) as a within-subject factor (see Fig. 3). The results yielded a main effect of time (F(3;159) = 71.4; p < .001; η2 = 0.57). We then conducted a set of pairwise comparisons, controlling the false discovery rate with the procedure of Benjamini and Hochberg (1995; hereafter BH), in which each p-value is compared against its own corrected threshold q. The comparisons indicated significant effects for T0-T1 (t(53) = −4.82; p < .001; BH-corrected threshold q = 0.008), T0-T2 (t(53) = −9.74; p < .001; q = 0.016), T0-T3 (t(53) = −9.88; p < .001; q = 0.025), T1-T2 (t(53) = −10.11; p < .001; q = 0.033), and T1-T3 (t(53) = −7.80; p < .001; q = 0.041). Finally, the difference between T2 and T3 was non-significant (t(53) = 1.07; p = 1).
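The corrected thresholds quoted above are consistent with the BH rule (rank / m) × α for m = 6 pairwise comparisons and α = 0.05, truncated to three decimals. The following is a minimal sketch of that step-up procedure, not the authors' analysis script. Exact p-values are not reported in the text, so 0.0009 is used as a labeled placeholder for each "p < .001"; the function names are illustrative.

```python
def bh_thresholds(m, alpha=0.05):
    """Rank-dependent BH thresholds (rank / m) * alpha for ranks 1..m."""
    return [alpha * rank / m for rank in range(1, m + 1)]


def bh_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up: flag each p-value as rejected or not."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    thresholds = bh_thresholds(m, alpha)
    # Find the largest rank k such that p_(k) <= (k / m) * alpha
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= thresholds[rank - 1]:
            largest_rank = rank
    # Reject every hypothesis whose rank is at or below that largest rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        rejected[idx] = rank <= largest_rank
    return rejected


thresholds = bh_thresholds(6)
print([round(q, 4) for q in thresholds])

# Placeholder p-values (assumption): 0.0009 stands in for each reported
# "p < .001" (T0-T1, T0-T2, T0-T3, T1-T2, T1-T3); 1.0 is reported for T2-T3.
p_values = [0.0009, 0.0009, 0.0009, 0.0009, 0.0009, 1.0]
rejected = bh_reject(p_values)
print(rejected)  # [True, True, True, True, True, False]
```

With these placeholders, the five comparisons involving a change from T0 or T1 are retained and the T2-T3 comparison is not, matching the pattern reported in the text.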

Fig. 3

Development of reading efficiency (CTL score on the Alouette test) according to phases T0, T1, T2, and T3. The gray line represents the observed scores/gains, and the dotted line represents the estimated scores/gains. T0 (mean = 125.5; sd = 49.2; 95% confidence interval, i.e., CI [112.3; 138.6]); T1 (mean = 148.6; sd = 59.8; 95% CI [132.7; 164.6]); T2 (mean = 187.6; sd = 65.6; 95% CI [170.1; 205.1]); T2’ (mean = 153.4; sd = 61.4; 95% CI [137.2; 170.1]); T3 (mean = 184.2; sd = 58.6; 95% CI [168.5; 199.8])
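The confidence intervals in this caption can be approximately reproduced from the reported means and standard deviations. The sketch below assumes a normal-approximation interval, mean ± 1.96 × sd / √n, with n = 54 inferred from the degrees of freedom t(53); the text does not state how the intervals were actually computed, and the function name is illustrative.

```python
import math


def normal_ci(mean, sd, n, z=1.96):
    """95% confidence interval for a mean under a normal approximation (assumed method)."""
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# Observed phases from the caption: (mean, sd); n = 54 is inferred from t(53)
phases = {
    "T0": (125.5, 49.2),
    "T1": (148.6, 59.8),
    "T2": (187.6, 65.6),
    "T3": (184.2, 58.6),
}
intervals = {name: normal_ci(mean, sd, n=54) for name, (mean, sd) in phases.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: [{low:.1f}; {high:.1f}]")
```

Under these assumptions the computed bounds fall within about 0.1 of the values given in the caption for the four observed phases.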

A post-hoc analysis was conducted to estimate the impact of the SRP program on the RVM intervention results, using the reading scores obtained by each participant at T0 and T1 as a baseline. In this way, a predicted score (T2’; see Fig. 3) was computed for each participant, so that the evolution expected between T0 and T2 without the RVM remedial intervention could be corrected for. In a generalized linear framework, this prediction allowed us to correct for the effect observed between the two types of interventions. Interestingly, the positive effect of the RVM intervention survived even when corrected for, or penalized by, the mean at T1 (t(53) = 8.56; p < .001; Cohen’s d = 1.16).
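To illustrate the idea of a predicted score, the sketch below extrapolates a participant's T0-to-T1 slope forward to T2. This is only one simple way such a prediction could be formed: the linear form is an assumption, the authors' exact model is not specified beyond "a generalized linear framework", and the durations (8 months of SRP, roughly 2 further months to T2) are taken from the Discussion. The function name is hypothetical.

```python
def predicted_t2(score_t0, score_t1, months_t0_t1=8.0, months_t1_t2=2.0):
    """Extrapolate the T0-to-T1 slope to T2 (linear assumption, not the authors' model)."""
    slope_per_month = (score_t1 - score_t0) / months_t0_t1
    return score_t1 + slope_per_month * months_t1_t2


# Applied to the group means at T0 and T1 purely for illustration
group_level = predicted_t2(125.5, 148.6)
print(round(group_level, 1))  # about 154.4
```

Applied to the group means, this gives a value close to, but not identical with, the reported T2’ mean of 153.4. Exact agreement is not expected, since the published predictions were computed per participant under the authors' own model.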

To better understand the effect of the RVM remedial intervention, other measures were analyzed, in addition to reading efficiency scores, at the pre-intervention (T1) and post-intervention (T2) phases (i.e., global evaluation): reading a text with meaning, reading regular words, irregular words, and pseudowords, metaphonology, phonological short-term memory (an efficiency score was calculated for the 4 subtests), visuo-attentional span, and phonemic fluency in oral and written modalities. The results (see Table 3) demonstrate significantly higher scores after the RVM remedial intervention for all tests (Cohen’s d ranging from 0.4 to 1.1).

Table 3 Correlations between the measured variables

Identifying participants who benefit from the RVM remedial intervention

We then sought to identify the proportion of children who significantly benefited from the RVM remedial intervention. Based on the predicted scores in T2’, we applied Crawford's single case study methodology (Crawford et al., 2010) to determine the cutoff threshold for a significant training gain. With an alpha of 0.05, the minimum reading efficiency gain needed was an increase of 21 points. Based on this threshold, 46 participants (85%) met the criterion for a significant improvement, while 8 participants did not. Table 3 provides the correlations between reading gain (the difference between T2 and T1, hereafter ΔT2-T1) and other scores obtained on reading and reading-related tasks at T1. Significant correlations were found between the ΔT2-T1 gain score and the T1 scores on word reading, metaphonology, written fluency, and the processing speed index (PSI; WISC V).

These correlational analyses motivated hierarchical regression modeling in order to better explain the ΔT2-T1 reading efficiency gain variable. Two models were retained. In the first model, PSI was a significant predictor of reading gain scores (β = .58, SE = .197, F = 8.73, p = .005, adjusted R2 = 0.16). Thus, higher PSI scores were associated with better reading gain scores. In the second model, PSI and written phonemic fluency were both significant predictors of reading gain scores (F = 8.43, adjusted R2 = 0.29; PSI: β = .57, SE = .181, p = .003; written fluency: β = 2.72, SE = .936, p = .006). Thus, higher scores in both PSI and written phonemic fluency were associated with better reading gain scores.
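For the single-predictor model, the reported coefficient, standard error, and F statistic can be cross-checked, because in a regression with one predictor the model F equals the square of the coefficient's t statistic. The sketch below assumes that β and SE refer to the same (unstandardized) coefficient; the function name is illustrative.

```python
def t_from_coefficient(beta, standard_error):
    """t statistic of a regression coefficient: estimate divided by its standard error."""
    return beta / standard_error


# First model: PSI as the only predictor (beta = .58, SE = .197)
t_psi = t_from_coefficient(0.58, 0.197)
f_psi = t_psi ** 2  # with a single predictor, F = t squared
print(f"t = {t_psi:.2f}, F = {f_psi:.2f}")  # t = 2.94, F = 8.67
```

The implied F of about 8.67 is close to the reported F = 8.73; the difference is within what rounding of β and SE to two or three decimals would produce.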

Finally, with regard to the questionnaire assessing positive attitude toward reading and writing, results showed that ratings at T2 were significantly more positive than ratings at T1 (respectively; mean T1 = 30.3, SD = 3.5; mean T2 = 33.1, SD = 2.9; t(53) = −4.53; p < .001; Cohen’s d = −0.61).
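The effect sizes reported in this section for paired comparisons are consistent with the within-subject form of Cohen's d, d = t / √n. The sketch below assumes that this form was used (the text does not say) and takes n = 54 from the degrees of freedom t(53); the function name is illustrative.

```python
import math


def cohens_dz(t_value, n_pairs):
    """Within-subject Cohen's d implied by a paired t statistic (assumed formula)."""
    return t_value / math.sqrt(n_pairs)


d_rvm = cohens_dz(8.56, 54)        # corrected RVM effect, t(53) = 8.56
d_attitude = cohens_dz(-4.53, 54)  # attitude questionnaire, t(53) = -4.53
print(f"{d_rvm:.2f} {d_attitude:.2f}")  # 1.16 -0.62
```

These values agree with the reported d = 1.16 and, to within rounding, with the reported d = −0.61.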

Discussion

In this work, two studies were conducted to evaluate the effectiveness of a novel intervention program, RVM, for improving reading ability in children with dyslexia. First, a pilot clinical study comparing a control group (reading without masking) with a treatment group (reading with masking, i.e., RVM) was used to test for immediate gains in reading fluency following RVM training. Then, a longitudinal study was used to examine more closely the dynamics of children’s reading fluency over a period of 13 months (for example, to what extent reading gains from RVM would be retained over time), as well as to consider a number of other relevant covariables. Note that for both studies, reading efficiency was measured by the Alouette test, which is the test of reference, or French “gold standard,” for assessing reading efficiency in children.

The results of both studies indicated that RVM training was significantly more effective than standard remediation program (SRP) training. Moreover, these results are in line with previous literature that also found improved reading performance with music masking (first proposed by Breznitz, 1997) rather than without (e.g., Strickland et al., 2013). Because the longitudinal study provides more information on the dynamics of reading fluency gains and losses over a realistic clinical period that includes RVM training as well as standard training before and after the intervention, and because it includes a number of other covariables that warrant further explanation, we concentrate the discussion on these results.

RVM vs. SRP training

The longitudinal study (possessing four phases: T0 to T3) provided an opportunity to compare rates of gains, as well as gain retention, of reading fluency for RVM vs. SRP training. In summary, clear reading efficiency gains were observed during, as well as after, the RVM training period that was applied between phases T1 and T2. With regard to gain retention, reading efficiency gains appeared to stabilize at 3 months post-training (phase T3).

Prior to that, baseline improvement in reading efficiency with the standard intervention program, SRP, was measured over a period of 8 months (phases T0 to T1). Based on these results, we then predicted a reading efficiency score, T2’, that simulated continued improvement with SRP up to T2 (i.e., continued SRP intervention only, without RVM). The statistical analyses showed that, even when the reading efficiency scores were corrected for, or penalized by, these two additional months of SRP improvement, the gains with the RVM intervention over SRP remained significant.

The sustained improvement in reading scores observed here following the RVM intervention can be viewed as evidence in favor of a beneficial reorganization of reading procedures. Moreover, children reported more positive attitudes about reading and writing (and even their testing) after the RVM intervention as compared to after SRP. Finally, modeling and thresholding analyses showed that, while 8 out of the 54 children did not significantly respond to the intervention, the majority of children (85%) improved their reading scores after the RVM intervention, and that processing speed and written phonemic fluency are good predictors of whether the intervention will be effective for a given child.

This study contributes to a growing clinical movement in favor of evidence-based remedial interventions. In this approach, training and care decisions are made according to the principles of Evidence-Based Practice (EBP; Sackett et al., 1996), in which convincing scientific data, the clinical expertise of practicing therapists, and the expectations of patients with the given disorder (here, developmental dyslexia) are considered in combination. In line with this approach, meta-analyses assessing intervention effectiveness (Ehri et al., 2001; Galuschka et al., 2014; Suggate, 2016) recommend taking into account patient variables such as reading and reading-related skills, lexical age, cognitive skills, and motivation, since these may affect a patient’s success in responding to a given intervention. Moreover, these meta-analyses suggest improved benefits in reading fluency when interventions are more regular and frequent (e.g., daily) and shorter in duration, as in our proposed RVM therapy approach.

In our study, and in accordance with the literature (e.g., Ziegler et al., 2019; Menghini et al., 2010; Saksida et al., 2016; White et al., 2006; Zoubrinetzky et al., 2014), all of the children with dyslexia presented a mixed deficit profile, specifically with deficits in both phonological and orthographic coding procedures (Sprenger-Charolles et al. 2011). To better understand the factors determining a successful response to the RVM intervention, regression modeling identified the child’s WISC-V processing speed index and written phonemic fluency level as determinants. On closer interpretation, verbal fluency tasks provide a measure of an individual’s ability to gain both controlled and flexible access to information in long-term memory (Fisk & Sharp, 2004). Such access to long-term memory implicates executive functioning in the task, and this claim is further supported by the phonemic fluency result from the model, since phonemic fluency has been argued to call more heavily upon executive functions than semantic fluency (e.g., Ardila et al. 2006). Few previous studies, to our knowledge, have evaluated oral and written phonemic fluency skills in dyslexic children. In the oral modality, their data generally show a deficit in phonemic fluency (Goswami, 2000; Snowling, 2000) and confirm that these children have better semantic than phonemic fluency (Weckerly et al., 2001). To improve upon phonemic deficits, a maturation of executive, strategic components such as working memory, self-monitoring, and flexible thinking has been argued for (Troyer et al., 1997). These model predictors (WISC-V processing speed and phonemic fluency) hence support the hypothesis that rapid access to words and strategic search through lexical/phonological memory are used in RVM (Baldo & Dronkers, 2006), and they may therefore be important selection criteria for RVM intervention.

It is important to note that the reading, and related-skill, profiles which classically justify a certain training recommendation (in order to stimulate and reinforce phonological and spelling coding procedures) are often not related to the observed gains post-intervention (Zoubrinetzky et al., 2014). Given the diversity of profiles observed in the study herein, we aimed to assess each child’s progress in accordance with the baseline principle (Casalis et al., 2019; Seguin, 2018). Baselines used in the clinic make it possible to validate the effectiveness of an intervention, repeatedly and longitudinally. In line with evidence-based practices, as discussed previously, the choice of a “predicted score” calculation made it possible to compare the effects of two training sessions on the same child longitudinally. While this choice of a “predicted score” calculation remains debatable, and cannot replace the strength of evidence of a control group, the capacity to analyze reading gains over different time periods (slopes, or rates) may provide valuable insights.

Beginning with the SRP intervention between phases T0 to T1, a significant improvement of reading gains was observed with a low amplitude slope. The predicted score, T2’, hence fully took into account this continued improvement. Then, the RVM intervention between phases T1 to T2 also resulted in a significant improvement, but with a notably high amplitude slope, which may arguably be sufficient in itself to justify a training effect. The predicted score, T2’, was calculated in the interest of discussing expected gains between RVM and SRP, and the analyses still indicated RVM as a significant improvement. However, even more important than the magnitude of these reading gains observed between T1 (SRP) and T2 (RVM), it is the maintenance of these improvements with RVM training that would truly confirm its effectiveness as a remedial intervention.

In this respect, upon analysis of phases T2 to T3, where SRP intervention was resumed, the improved reading efficiency gained from the previous RVM training appeared to stabilize, though with a negative low-amplitude slope (i.e., a slight loss). Future studies may consider lengthening this post-assessment period to determine where the positive effects ultimately stabilize or plateau. Moreover, cumulative training time did not explain the gain differences between the SRP (16 h) and RVM (7.5 h) interventions. Indeed, the SRP intervention took place over a longer time period with weekly training sessions (8 months, approximately once per week/30 min per session), whereas the RVM intervention took place over a shorter time period with daily training sessions (5 weeks, 6 times a week/15 min per session). These parameters merit further exploration in future studies.
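The cumulative training times quoted above follow directly from the session schedules. The short calculation below makes the arithmetic explicit; the variable and function names are illustrative, and the number of SRP sessions is back-computed from the stated 16 h rather than reported in the text.

```python
def total_hours(sessions, minutes_per_session):
    """Cumulative training time in hours."""
    return sessions * minutes_per_session / 60


# RVM: 5 weeks, 6 sessions per week, 15 min per session
rvm_sessions = 5 * 6
rvm_hours = total_hours(rvm_sessions, 15)
print(rvm_hours)  # 7.5

# SRP: 16 h at 30 min per session implies about 32 weekly sessions,
# which is compatible with roughly 8 months at one session per week
srp_sessions_implied = 16 * 60 / 30
print(srp_sessions_implied)  # 32.0
```

The RVM program thus delivered less than half the cumulative training time of the SRP phase, which is why training time alone cannot account for the larger RVM gains.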

Repeated reading with or without vocal masking

Consistent with the results obtained by Breznitz (1997, 2012) and with the literature on the effectiveness of repeated reading interventions (Therrien, 2004; Strickland et al., 2013), our results support the hypothesis that this type of training is more effective when combined with auditory masking.

For example, in the clinical pilot study included herein, the first dyslexic group of randomly assigned participants (n = 37) received the RVM intervention program with music masking, and the second dyslexic group (n = 29) received this intervention without auditory masking. A repeated-measures ANOVA on reading efficiency (CTL – Alouette) scores indeed demonstrated that only the dyslexic children who received the intervention with auditory masking showed significant gains in reading efficiency relative to their test scores prior to the intervention.

For clinical effectiveness, it is also important to take into account individual differences in response to therapy. In the RVM group, 8 out of 37 children (~ 22%) had slightly lower scores in fluency gains from the training. In contrast, 3 out of 29 children (~ 10%) in the non-auditory masking group had slightly lower scores in fluency gains, yet all other children in that group (~ 90%) showed notably lower fluency gains than those in the RVM group. This clinical pilot study was instrumental for the evaluation of the immediate fluency gains that could be obtained from RVM training and, moreover, for the design of the longitudinal study, which, in turn, allowed us to examine gain retention and dynamics over time for RVM vs. standard remediation program (SRP) training.

In favor of a beneficial reorganization of reading procedures (Breznitz, 1997)

The originality of Breznitz’s (1997) founding study was to associate several experimental conditions with sentence reading, in particular rapid reading with or without vocal music masking. Her objective was to assess improvement not only in reading speed but also in written comprehension, by asking questions about the meaning of the presented sentences at the end of the reading. The results indicated better reading performance in children with dyslexia when the training combined rapid reading with vocal music masking, suggesting that the auditory masking allowed the phonological pathway to be “saturated.” Her data supported the hypothesis that persistent reading speed deficits are more likely due to a persistent over-activation of grapho-phonological conversion procedures than to deficits in phonological and orthographic coding procedures alone. Her approach, and therefore that of our study, is original in the sense that it does not seek to directly reinforce underlying component skills, but to facilitate access to orthographic representations. Moreover, theoretical support for the effectiveness of vocal music masking is provided by Baddeley’s working memory model (for a review, see Baddeley, 1990), which postulates that inattentive listening has an effect on phonological storage capacities, in turn favoring a phonological loop. Indeed, studies conducted on adults report a selectively beneficial effect of language or music (regardless of vocal component) on phonological storage capacities, but no effect of random noise (Salamé & Baddeley, 1987, 1989).

Research has shown that written word identification skills are dependent on the proper functioning of the phonological loop, particularly phonological storage (e.g., Snowling & Hulme, 1989). The issue of phonological storage dysfunction in dyslexic children is raised in most studies by a deficit in working memory tasks (e.g., Gathercole & Baddeley, 1993; Majerus & Boukebza, 2013). These data are consistent with the causal relationship between developmental delay in the grapho-phonological conversion procedure and phonological deficits present in dyslexic children (Menghini et al., 2010; Saksida et al., 2016; White et al., 2006). Some authors (Swanson & Alexander, 1997) postulate that in the dyslexic child, this storage deficit may only be encapsulated and isolated at the phonological loop or, on the contrary, may lead to a more general dysfunction of working memory. Other authors (Majerus & Cowan, 2016) stress the need to distinguish between the “item” aspect of the information to be memorized in the short term, which relates to phonological and semantic characteristics, and the “serial” aspect, which relates to the order in which this information is presented. The “item” aspect of the information would therefore be related to the proper functioning of the phonological loop, whereas the “serial” aspect would depend on the quality of the executive functions. This serial aspect is also observed in visuo-spatial short-term memory tasks, which would imply that a short-term memory deficit is not only the consequence of underlying phonological deficits (Hachmann et al., 2014; Romani et al., 2015).

Therefore, to address this overreliance on grapho-phonological conversion procedures, music masking during reading is intended to disrupt the phonological storage that is necessary to carry out such conversion procedures. As a consequence, the reader has to rely on other word recognition procedures, such as activating orthographic representations more specifically, relying on meaning, and recruiting executive skills for processing speed.

In our study, the repeated reading of the same texts (6 times a week) during RVM training was designed to facilitate the use of orthographic coding and to reinforce the semantic coding procedures implicitly used to compensate for written word recognition deficits. The auditory masking of RVM was intended to disrupt phonological loop storage, limiting activation of grapho-phonological conversion procedures and thus decreasing the generalization of the phonological loop dysfunction to the whole working memory system (Swanson & Ashbaker, 2000). A number of significant reading efficiency improvements were observed in children with dyslexia, which are compatible with recent findings. Notably, the data suggest that children with dyslexia applied improved executive skills in reading and reading comprehension (e.g., Sesma et al., 2009, and for a review, see Booth et al., 2010) and increased their reading speed (Swanson & Alexander, 1997; Swanson & Jerman, 2007), although their performance remained generally more impaired than that of normal readers (for a review, see Kudo et al. 2015). Moreover, such gains in executive functioning may promote metacognition, an important component of inhibitory control and motivation (Sonuga-Barke, 2003).

Retest effects

It is also worthwhile to discuss the issue of possible retest effects following repeated use of the same leximetric test (the Alouette reading test) throughout the study phases (T0/T1/T2/T3) in order to assess reading gains over time. As noted previously, the Alouette is considered the “gold standard” in France for screening both children (e.g., Bertrand et al., 2010) and adults (Cavalli et al., 2018) for dyslexia. A central design feature of this test is that the text is meaningless, while being syntactically and grammatically correct. This is done in order to limit the dyslexic reader’s access to contextual information (Nation & Snowling, 1998; Rack et al., 1992), or to reading strategies based on semantic skills, which are frequently used to compensate for orthographic- and phonological-processing deficits. With this design, the test has been shown to be psychometrically valid (both sensitive and specific) for screening for dyslexia in adults, even in a specific population of high-functioning university students with dyslexia who had developed compensatory strategies (Cavalli et al., 2018). Moreover, the lack of improvement in reading efficiency scores that we observed during phase T3 (a 3-month period) argues for absent, or at most weak, retest effects in the Alouette, and thereby for its reliability as a test.

Conclusion

In summary, the remedial approach tested herein was aimed at rebalancing the different coding levels involved in written word recognition and improving executive skills, rather than directly reinforcing specific, deficient coding skills. In line with this hypothesis, the results demonstrated an index of overall processing speed, and phonemic fluency, to be the best predictors of successfully responding to the intervention. The hypothesis of a rebalancing of coding procedures, in accordance with standard connectionist theory, is also consistent with the reading gains observed, extending to the gains observed in phonological and visuo-attentional skills.

A number of questions remain about why the RVM intervention may provide such a positive effect. For example, in a future study we would like to differentiate to what degree, if any, the positive effects of RVM training may be explained simply by positive attentional reinforcement effects. For example, a musical reinforcer that was less well-received by the children could possibly have had the opposite effect on phonological storage. One could also consider other psychometric tests than those used in the study (e.g., in Tables 2 and 3) to better assess the gains and deficits associated with the dyslexic profiles. For example, although the initial hypothesis is that dyslexia stems from a phonological deficit (Norton et al., 2014), subsequent works postulate that problems in written-word recognition cannot be related to a single deficit (Ramus & Ahissar, 2012), while multiple deficit theories propose a multi-factor causal model (e.g., Pennington, 2006), in which a number of sensory or cognitive processes are altered to varying degrees.

In the context of existing clinical interventions, the RVM intervention program provides an innovative framework that has been shown to be effective for remediating reading deficiencies in children with dyslexia. Its main advantage, in its proposed form, is that it is short, intensive, and targeted, making it both attractive and viable for clinical settings. This proposed intervention could help address the lack of validated rehabilitation approaches (Casalis et al., 2019; Fitz-Gibbon & Morris, 1996; Seguin, 2018); moreover, the present work is in line with a global movement for evidence-based practices in therapy. Future studies could assess the cumulative effect of several intermittent RVM interventions introduced over several years, making it possible to explore the curative nature (or degree thereof) of the program.