Introduction

It is generally agreed that the typical bilingual is confronted with a variety of situations that might regularly exercise aspects of attention. Consequently, a question of great interest has been: Are any aspects of attention improved by this exercise? Different aspects of attention were described in a taxonomy of attention proposed by Posner and Petersen (1990). Inspired by this taxonomy, the attention network test (ANT) was developed by Fan et al. (2002) to measure the efficacy of three isolable networks of attention: alerting (achieving and maintaining a state of readiness to respond), orienting (giving processing preference to some sensory inputs over others) and executive control (resolving conflict between activated response tendencies). Albert Costa, a renowned bilingualism researcher, was the first scientist to exploit the ANT to address this topic of great interest (Costa et al. 2008, 2009). Our purpose in this paper is to highlight Costa’s rationale for using the ANT, to describe his findings about the effects of bilingualism on the network scores generated by this test, and to provide a meta-analysis of the many studies by researchers who followed his lead in using this test to try to answer the aforementioned question.
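To make the three network scores concrete, the following minimal sketch shows the subtractions conventionally used to derive them from condition-mean RTs, following Fan et al. (2002). The function name, dictionary keys, and RT values are hypothetical illustrations and are not taken from any study discussed here.

```python
def network_scores(rt_by_cue, rt_by_flanker):
    """Return ANT network scores (ms) from condition-mean reaction times.

    rt_by_cue: mean RT for the 'none', 'double', 'center', 'spatial' cue conditions.
    rt_by_flanker: mean RT for the 'congruent' and 'incongruent' flanker conditions.
    """
    return {
        # Alerting: benefit of a warning cue that carries no spatial information.
        "alerting": rt_by_cue["none"] - rt_by_cue["double"],
        # Orienting: benefit of a cue that indicates the target location.
        "orienting": rt_by_cue["center"] - rt_by_cue["spatial"],
        # Executive control: cost of resolving conflict from incongruent flankers.
        "executive": rt_by_flanker["incongruent"] - rt_by_flanker["congruent"],
    }


# Hypothetical condition means (ms), for illustration only.
example_cue_rts = {"none": 620.0, "double": 580.0, "center": 590.0, "spatial": 550.0}
example_flanker_rts = {"congruent": 540.0, "incongruent": 640.0}
example_scores = network_scores(example_cue_rts, example_flanker_rts)
print(example_scores)
```

Under this convention, larger alerting and orienting scores indicate greater benefit from the cue, whereas a smaller executive score indicates less flanker interference.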

Costa et al. (2008)

This seminal paper (Footnote 1) was titled: “Bilingualism aids conflict resolution: Evidence from the ANT task.” After noting that “When producing and comprehending language bilinguals need to ensure that the correct lexical representations are accessed,” Costa et al. (2008) imply that this problem must be solved by control mechanisms that prevent “massive interference” from the irrelevant language, and they pose the question that their paper addresses: “does the continuous use of such a control mechanism have an impact on other general-purpose attentional mechanisms?” They later introduce Posner and Petersen’s taxonomy of attention, and propose:

“To explore the impact of bilingualism on the attentional abilities of young individuals at the peak of their attentional capabilities, we asked monolingual and bilingual speakers to perform the attentional network task (ANT) developed by Fan, McCandliss, Sommer, Raz, and Posner (2002).” (p. 63) “This task is especially appropriate to assess potential differences between monolinguals and bilinguals, since it relies minimally on linguistic and memory processes that may interact with bilingualism.” (p. 65).

To answer the executive control question, they examined both the flanker effect and switching (by looking at sequence effects) in young adults. Partly because so few scholars who have applied the ANT to this question have explored sequence effects, our focus here will be confined to overall performance and network scores. Their rationale for testing young adults is clearly stated:

“Given these observations, one should expect bilingualism to affect the functioning of the attentional processes across the life-span, even at those ages at which individuals are at the peak of their attentional capabilities. However, the research conducted with younger adults has not led to such strong results…Thus, at present, more evidence is needed to unequivocally show that there is a behavioural difference between bilinguals and monolinguals when they are at the peak of their attentional abilities.” (p. 62).

Before we describe the findings from this paper, we want to highlight, in no particular order, some of its many positive features. In this literature Costa et al. (2008) is noteworthy for testing a relatively large number of participants (200 in total, more than 3 times as many as had been used in any of the studies reviewed 3 years later by Hilchey and Klein (2011; see Table 2)). The nature of their monolingual and bilingual groups is highlighted as being less subject to a variety of “hidden factors” that might affect performance, such as socioeconomic status, social development (Carlson 2009) and the influence of culture (Samuel et al. 2018):

“Bilingual participants were living in a bilingual society (Barcelona, Catalonia), while monolingual speakers were living in a monolingual one (Tenerife, Canarian Islands). Thus, their linguistic status (bilingual vs. monolingual) reveals the linguistic environment in which the individuals live and not other factors such as intelligence, motivational factors, socioeconomic status, education, etc. That is, the bilinguals have achieved such a status not because they are more motivated or more intelligent than the monolinguals, but rather because they had been continuously exposed to two languages.” (Costa et al. 2008, p. 68).

As such, the bilingual participants in this paper were not subject to many of the “hidden factors” (Bialystok 2001, p. 7) that, like bilingualism, might affect cognitive development (for a general discussion, see Hilchey and Klein 2011). This article also deserves praise for its complete presentation of the ANT data; that is, reaction time (RT) and accuracy (error rates) were broken down by group, congruency, and cue condition.

Costa et al. (2008) summarize their findings as follows:

“Although the pattern of results for the bilingual and the monolingual groups are qualitatively similar, there are important quantitative differences between the two groups. The four most relevant differences are the following:

  (a) Bilinguals were faster than monolinguals irrespective of whether the trial was congruent or incongruent.

  (b) Bilinguals suffered less interference from incongruent flankers than monolinguals.

  (c) Bilinguals suffered less switching cost (especially for congruent trials) than monolinguals.

  (d) Bilinguals took more advantage of the alerting cue than monolinguals.” (p. 77).

Our meta-analysis will explore the extent to which (a), (b) and (d) have stood up to further examination (Footnote 2).

Costa et al. (2009)

Whereas Costa et al. (2008) reported that congruency interacted significantly with language group (finding (b) from above) they also reported a 3-way interaction involving block: the bilingual advantage in resolving flanker conflict was present in blocks 1 and 2 of the ANT, but not in block 3. This finding is particularly pertinent for the paper, entitled: “On the bilingual advantage in conflict processing: Now you see it, now you don’t,” that Costa and his colleagues (Costa et al. 2009) published in the following year. As we will see, whether it was intended or not, the subtitle (Now you see it, now you don’t) aptly reflects the inconsistent evidence for a bilingual advantage.

Similar to the 2008 paper, this one also used the ANT—though it was modified by dropping the neutral flanker condition and varying the proportion of congruent (and conversely incongruent) trials. The focus here was on how a monitoring advantage might mediate an advantage in conflict processing. The rationale for the two experiments reported in this study was clearly outlined:

“In this study we test the hypothesis that the bilingual advantage in overall RTs stems from a more efficient monitoring system. The rationale of the study is the following: if the bilingual advantage is in some way related to the functioning of the monitoring system, then it should be present in conditions requiring high monitoring demands, and reduced or absent in those conditions in which the monitoring system is less taxed.” (p. 136).

In the first experiment, with low monitoring demands (because of the relatively low frequency of switches in congruency) half the participants experienced 8% congruent trials while the other half experienced 92% congruent trials. In total there were 120 participants in this experiment (60 monolinguals and 60 bilinguals). The findings were in striking contrast with the 2008 paper: There were no significant effects of, or interactions with, language group. Costa et al. (2009) identify the source of the difference: “The most important difference between the two studies is the distribution of congruent and incongruent trials.” 33% of the trials were conflicting in the 2008 paper whereas in this experiment either 8% or 92% of the trials were conflicting.

The second experiment, with high monitoring demands, was designed to confirm this source. In this experiment, half the participants experienced 50% congruent trials while the other half experienced 75% congruent trials. In total there were 124 participants in this experiment (62 monolinguals and 62 bilinguals). Whereas some findings from the 2008 paper were supported in some conditions of this experiment, others were not. The importance of practice on the task was reinforced in the 75% congruent condition: Bilinguals showed both an RT and a flanker interference advantage, but only in the first block. The co-occurrence of these two bilingual advantages, and the disappearance of both with practice, might lead one to suspect that they are mediated by the same underlying cognitive processes. The results from the 50% congruent condition, however, provide no support for such a linkage. Here there was a bilingual advantage in overall RT in all three blocks and no hint of a flanker interference advantage in any block.

Because of the emphasis on monitoring and conflict, presentation of the alerting and orienting network findings was relegated to an appendix. There we learn that there were no significant effects of language group upon these network scores, but no actual scores are reported. For this reason, the 2009 paper was not included in the main meta-analyses reported next but, for the interested reader, a summary of the findings from these two papers is presented in Table 1.

Table 1 Summary of the results from Costa’s two papers using the ANT to explore attention in monolinguals (M) and bilinguals (B)

Meta-analysis

Costa’s seminal studies and the great interest in the question about whether bilingualism improves any aspects of attention inspired many researchers to explore the possible modification(s) of attention by bilingual exercise using the ANT. In the Attention Network Database created by the present authors, we found 16 such papers. The purpose of this paper is to let the reader know what this literature, stimulated by Costa, reveals about this question.

Methods

In order to evaluate the existing literature that has explored attentional differences in bilingualism using the ANT, we used the attention network test (ANT) Database (Arora et al. 2020). The ANT Database is a repository of all academic publications that have cited the original Fan et al. (2002) paper and used the ANT or any of its variants in an experiment. The search term “bilingual” returned a list of 40 publications that had used the ANT with a bilingual population. This list was further condensed to include only studies that reported all three network scores and that had both a bilingual and a monolingual population. In order to account for developmental variability, studies were divided into three different age groups for analysis (Table 2). Participants in these experiments resided in a number of countries and spoke many different L1 and L2 languages.

Table 2 Demographics for all studies in the three meta-analyses organized by age group

The bilingualism comparison in child populations was conducted on six publications that used an ANT. Participants in these studies ranged in age from 4 to 17 years (weighted mean age and standard error (SE) of monolinguals and bilinguals = 11.05 ± 0.08 and 10.88 ± 0.07). Two of these studies used the original ANT, and the remainder used the child ANT (Footnote 3), developed by Rueda et al. (2004). A total of nine publications were included in the young adult analysis. These consisted of studies that had participants ranging between 17 and 55 years old (weighted mean age and SE of monolinguals and bilinguals = 21.7 ± 0.25 and 22 ± 0.17) and used the original ANT, a variant of the original ANT with minor modifications to the cue presentation interval (see Sabourin and Vīnerte 2019), or the lateralized-ANT (Greene et al. 2008).

Lastly, two studies explored bilingual differences in middle-aged adults ranging between the ages of 47 and 60 years. Both studies used the original ANT, and the weighted mean age and SE across the two studies were 48.92 ± 0.69 for the monolingual group and 49.58 ± 0.64 for the bilingual group.
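As an illustration of how a weighted mean age of this kind can be obtained, the sketch below weights each study's mean age by its sample size. Sample-size weighting is our assumption here, since the weighting scheme is not spelled out above, and the ages and group sizes in the example are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean of per-study values, e.g. mean ages weighted by group size."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical per-study mean ages (years) and group sizes, for illustration only.
study_mean_ages = [10.0, 12.0]
study_group_sizes = [30, 10]
pooled_age = weighted_mean(study_mean_ages, study_group_sizes)
print(pooled_age)
```

Larger studies pull the pooled value toward their own mean, which is why the result sits closer to the age of the 30-participant group than to that of the 10-participant group.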

Results

Data for all three populations were analyzed through Bayesian hierarchical modeling using the Stan package in R (Carpenter et al. 2017). Weakly informative priors were used, and posterior samples were obtained across six independent chains, each consisting of 10,000 warmup and 10,000 post-warmup iterations. All models passed the diagnostic checks provided by the rstan package (Stan Development Team 2019) for R. When, for a given study, the standard deviation (SD) for RT or network scores was not reported or could not be derived, the value entered for that study was the mean SD from the other studies in that group. Our hierarchical model provided the opportunity to quantify heterogeneity between studies as well as in each of the individual measures between groups (see “Appendix”).
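The substitution rule for unreported SDs can be sketched as follows. This is a minimal illustration of the rule as described above, not the code used for the analyses; the function name is hypothetical, `None` is used here to mark an unreported SD, and the example values are invented.

```python
from statistics import mean


def impute_missing_sd(sds):
    """Replace unreported SDs (None) with the mean of the SDs that were reported.

    `sds` holds one entry per study within a single age group and measure.
    """
    reported = [sd for sd in sds if sd is not None]
    if not reported:
        raise ValueError("at least one study must report an SD")
    fill_value = mean(reported)
    return [fill_value if sd is None else sd for sd in sds]


# Hypothetical SDs (ms) for three studies; the second study did not report one.
example_sds = [20.0, None, 30.0]
completed_sds = impute_missing_sd(example_sds)
print(completed_sds)
```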

Forest plots for the studies included in the child analysis are reported in Fig. 1. Each data point represents the mean network score or mean RT, and the size of the symbol is proportional to the number of participants. Error bars indicate the SD of each mean, which was reported in all but two studies for network scores. Mean RTs were extracted from summary tables, and corresponding SDs were reported in only three studies.

Fig. 1 Forest plots of mean RT and network scores of the bilingual and monolingual groups with child participants (grouped vertically). Relative size of data points reflects the N and error bars represent ± 1 SD. Studies are entered in alphabetical order from bottom to top

Because each study had both a monolingual and a bilingual group, the intercepts of the means for the three network scores and mean RT could be compared between groups. This was facilitated by using standard errors to estimate each study’s measurement noise in a partially pooled model. These mean intercepts, calculated from across-group averages for each measure, are presented in Fig. 2a. The posterior medians and credibility intervals (CrI) of the differences between bilingual and monolingual groups on measures of mean RT (median = −46 ms, [−136, 37]), alerting (3 ms, [−10, 17]), orienting (−2 ms, [−15, 7]), and executive functioning (−2 ms, [−15, 6]) are reported in Fig. 2b. Because every interval includes zero, these results suggest that the two groups do not credibly differ on any measure.
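The logic of partial pooling can be illustrated with a deliberately simplified sketch. It is not the Stan model used for the analyses reported here: it assumes a normal-normal model in which the across-study mean (`mu`) and between-study SD (`tau`) are treated as known, whereas a full hierarchical model estimates them from the data. All names and numbers are hypothetical.

```python
import math


def standard_error(sd, n):
    """Standard error of a study mean from its SD and sample size."""
    return sd / math.sqrt(n)


def partially_pooled_mean(y, se, mu, tau):
    """Posterior mean of one study's true value under a normal-normal model.

    y: observed study mean; se: its standard error (measurement noise);
    mu, tau: assumed across-study mean and between-study SD.
    The estimate is a precision-weighted average of y and mu.
    """
    precision_data = 1.0 / se ** 2
    precision_group = 1.0 / tau ** 2
    return (precision_data * y + precision_group * mu) / (precision_data + precision_group)


# Hypothetical executive scores (ms): a precise study and a noisy study.
precise_se = standard_error(30.0, 36)   # SD 30 ms, N = 36
noisy_se = standard_error(30.0, 9)      # SD 30 ms, N = 9
precise_estimate = partially_pooled_mean(100.0, precise_se, mu=80.0, tau=5.0)
noisy_estimate = partially_pooled_mean(100.0, noisy_se, mu=80.0, tau=5.0)
print(precise_estimate, noisy_estimate)
```

The noisier study is pulled further toward the across-study mean, which is the sense in which standard errors determine how much each study contributes to the pooled estimate.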

Fig. 2 Violin plots of the posterior distributions of the (a) intercepts and (b) differences between child bilingual and monolingual groups. Black dots represent the posterior median, thick white bands reflect 50% credibility interval (CrI), and thin white bands reflect 95% CrI
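The summaries shown in such violin plots (posterior median, 50% CrI and 95% CrI) can be computed directly from posterior draws. The sketch below uses equal-tailed intervals, which is our assumption about the interval type, and simulated draws that stand in for real posterior samples; the function name and all values are hypothetical.

```python
import math
import random
from statistics import median


def central_interval(draws, mass):
    """Equal-tailed interval covering at least `mass` of the draws.

    Bounds are taken at sorted-sample positions, rounding outward so that
    the interval is never narrower than requested.
    """
    ordered = sorted(draws)
    last_index = len(ordered) - 1
    tail = (1.0 - mass) / 2.0
    lower = ordered[math.floor(tail * last_index)]
    upper = ordered[math.ceil((1.0 - tail) * last_index)]
    return lower, upper


# Simulated draws of a group difference (ms); placeholders for posterior samples.
rng = random.Random(1)
simulated_draws = [rng.gauss(-14.0, 5.0) for _ in range(6000)]
posterior_median = median(simulated_draws)
cri_50 = central_interval(simulated_draws, 0.50)
cri_95 = central_interval(simulated_draws, 0.95)
print(posterior_median, cri_50, cri_95)
```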

The same methods of analysis were applied to the young adult population using the 9 ANT studies listed in Table 2. Forest plots of the mean RT and average network scores for each group in each study are presented in Fig. 3. Figure 4a presents the distribution of the across-group intercepts for the three network scores and mean RT. The posterior medians and CrIs for between-group differences on measures of mean RT (median = −27 ms, [−51, 3]), alerting (4 ms, [−2, 9]), orienting (1 ms, [−5, 8]), and executive functioning (−14 ms, [−25, −4]) are shown in Fig. 4b. As in the previous model, the intervals for mean RT, alerting, and orienting include zero, suggesting that the groups do not credibly differ on these measures. With young adults, however, the difference in the executive network is credibly non-zero.
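The decision rule applied to these intervals can be stated compactly: a difference is treated as credibly non-zero only when its interval lies entirely on one side of zero. The sketch below applies that rule to the young adult intervals reported in the preceding paragraph; the function and variable names are hypothetical.

```python
def excludes_zero(lower, upper):
    """True when an interval lies entirely below or entirely above zero."""
    return lower > 0 or upper < 0


# Young adult intervals (ms, bilingual versus monolingual) as reported in the text.
young_adult_cri = {
    "mean RT": (-51, 3),
    "alerting": (-2, 9),
    "orienting": (-5, 8),
    "executive": (-25, -4),
}

credible_measures = [
    measure
    for measure, (lower, upper) in young_adult_cri.items()
    if excludes_zero(lower, upper)
]
print(credible_measures)
```

Only the executive network interval excludes zero, which is the basis for the contrast drawn above between that measure and the other three.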

Fig. 3 Forest plots of mean RT and network scores for the young adult bilingual comparison

Fig. 4 Violin plots of the posterior distributions of the (a) intercepts and (b) differences between young adult monolingual and bilingual groups

Finally, we applied this same method of analysis to the two studies using middle-aged adults (Footnote 4). Because there were only two studies in this analysis, there was little information with which to update the models’ priors. However, this should not induce any kind of bias, and the posterior should still reflect a rational characterization of what one should believe after seeing the data. Forest plots for the middle-aged group are presented in Fig. 5 and the posterior distributions of the intercepts and differences in Fig. 6. The CrIs for mean RT (−62 ms, [−225, 131]), alerting (2 ms, [−19, 21]), orienting (13 ms, [−12, 32]), and the executive network (−28 ms, [−64, 15]) all included zero.

Fig. 5 Forest plot of the mean network scores in the two studies of middle-aged bilinguals and monolinguals

Fig. 6 Violin plots of the posterior distributions of the (a) intercepts and (b) between-group differences for middle-aged participants

Discussion

The foundational influence of Costa’s research into bilingualism using the ANT is apparent as his seminal papers are cited in every one of the studies analyzed in the present project. The findings from these three meta-analyses are summarized in the corresponding sections below.

Children

Bilingual versus monolingual group differences in the child analysis were not significant on any measure. This is consistent with the findings of each individual study, save for Kapa and Colombo (2013), who reported faster overall mean RT in their early-acquisition bilingual group, in contrast to Arredondo’s (2017) finding that bilingual participants were slower. All studies in our child analysis aimed to match or, at a minimum, thoroughly evaluate participants on external variables that may interact with bilingualism. All reported on variables such as parental socio-economic status (or education), participant fluency in L1 or L2 in bilingual cohorts, and intelligence, as assessed by various measures. Almost all bilingual participants were raised in the same country as their monolingual counterparts. The exceptions were Antón-Ustaritz (2017), in which bilingual participants were from the Basque Country, an autonomous region in northern Spain, and Kapa and Colombo (2013), which reported only that participants spoke English and Spanish at home before the age of 3 in the early-acquisition bilingual group, and only Spanish before the age of 3 in the late-acquisition group. Three studies reported differences in socioeconomic status between participants and used this as a covariate in their analyses; even so, they reported no differences between the bilingual and monolingual groups.

Young adults

Bilingual versus monolingual group differences were not significant on measures of mean RT, alerting, or orienting, but showed credible non-zero values for differences in the executive network. All but one study (Vivas et al. 2017) reported such executive differences between these two groups. Four studies reported faster RTs (Yang and Yang 2016; Tao et al. 2011; Ooi et al. 2018; Desideri and Bonifacci 2018) and three reported more efficient use of the alerting cue (Tao et al. 2011; Sabourin and Vīnerte 2019; Marzecová et al. 2013; Desideri and Bonifacci 2018) with bilingual participants.

Included in the present meta-analysis was Costa’s 2008 paper, which began to set a framework for establishing better measurements of true bilingual differences. Of the seven other studies included, all but one collected measures of intelligence (as measured by Raven matrices), socio-economic status, rural/urban status, or parental education. Three reported differences between the monolingual and bilingual groups in at least one of these measures. However, all three of these studies still found statistical differences in executive performance for at least one of their bilingual groups when using these between-group differences as covariates. Additionally, two of these studies also found significant differences in mean RT between at least one of their bilingual groups and the other monolingual and bilingual groups in their respective studies (Tao et al. 2011; Ooi et al. 2018), while Tao et al. (2011) also reported that bilinguals took more advantage of the alerting cue.

Though participants were relatively well matched in the young adult analysis on the variables described above, there was a large variety of L1 and L2 languages spoken, with participants originating from numerous different cultures. In Costa et al. (2008), although participants from the two groups originated from different geographical regions, as they emphasized in their appendices and we noted in our introduction, their groups were likely very well matched on non-linguistic variables. The seven other studies included in this analysis, however, did not have the same stringent methods for inclusion. Using the Hofstede Insights web-tool based on Hofstede et al. (2010), we discovered that in four of these studies bilinguals were likely drawn from collectivist societies and monolinguals were likely drawn from individualistic societies. Whereas all four of these studies reported bilingual advantages, the confound opens the door to the possibility that these advantages are culturally rather than linguistically mediated (cf. Arora et al. 2020; Paap et al. 2015).

Along with Costa et al. (2008), three other studies are not subject to this possibility. Of these four, three reported bilingual advantages in executive functioning. One (Desideri and Bonifacci 2018) also reported faster RTs in the bilingual group. The exception, Vivas et al. (2017), found no significant group differences in executive functioning and in fact reported a monolingual advantage in overall RTs. However, when controlling for early vs. late bilingual acquisition, this monolingual advantage remained numerically present but was no longer statistically significant.

For this group, our findings from this analysis are consistent with Costa et al. (2008) but contrast with the “now you see it; now you don’t” subtitle of Costa et al. (2009). These findings also contrast with the view that bilingual advantages in attention in young adults might not be seen because of ceiling effects (optimal executive control) at this age (Bialystok et al. 2005; see also Bialystok 2015, personal communication; Footnote 5).

Middle-aged adults

We were only able to find two studies that looked at bilingualism with middle-aged participants using the ANT. Although the results of this analysis are therefore somewhat limited, this model should nevertheless produce a rational characterization of these demographics. As with our child age group, no significant differences were found in any of the three network scores. In both studies, bilingual and monolingual participants were raised in the country in which the experiment was conducted and were evaluated based on income and occupation. Only one reported socioeconomic status, literacy, and familiarity of L1 vs L2 in bilinguals (Nair et al. 2017).

It is certainly a limitation of this analysis that there are only two studies, both of which were relatively under-powered (see Table 2). However, other studies using the ANT with older bilingual populations have also reported no significant differences (Borsa et al. 2018; Mishra et al. 2019) in the behavioural measures analyzed here.

Conclusions and caveats

Because the literature we have analyzed here was inspired by Costa’s seminal research using the ANT, we will summarize our findings in relation to the conclusions from Costa et al. (2008) presented in the introduction:

  (a) Costa et al. reported that “Bilinguals were faster than monolinguals irrespective of whether the trial was congruent or incongruent.” In our present analyses, we did not find a credible bilingual advantage in overall RT in any age group.

  (b) Costa et al. reported that “Bilinguals suffered less interference from incongruent flankers than monolinguals.” We found a credible bilingual advantage in flanker interference in young adults (the same age group as in Costa et al.) but there was not a credible advantage in either children or middle-aged adults.

  (d) Costa et al. reported that: “Bilinguals took more advantage of the alerting cue than monolinguals.” In none of the age groups we analyzed did we find a similar alerting benefit. Consistent with Costa et al., we found no differences in orienting between monolinguals and bilinguals.

It is important to leave the reader with some benefits and limitations of this focussed review. Because the ANT is aimed at assessing several components or networks of attention, we believe it is a particularly useful tool for exploring the question posed at the beginning of our abstract: “Are there differences between bilinguals and monolinguals in non-linguistic cognitive processes related to attention?” Moreover, despite minor differences in implementation of the ANT across studies, we are analyzing results from publications using relatively similar methods and measurements; such methodological homogeneity is considered a benefit for the kind of pooling entailed in our meta-analysis. But methodological homogeneity limits generalizability and opens the door to the possibility that the test (the ANT) has missed a particularly telling aspect of attention, and one for which a bilingual advantage might have been more consistently observed.

In this regard, it is worthwhile considering our highly focussed meta-analytic findings in the context of the wide variety of tasks that have been applied to the question of interest (for a thoughtful review of such tasks, the reader is referred to the appendix in Valian 2015). In the first arm’s length review of the question, “whether bilingualism is associated with executive control advantages”, Hilchey and Klein (2011) focussed on three tasks that, in the literature, had been used to measure executive control: flanker, Simon and spatial-Stroop. Their review and graphic meta-analyses found scant evidence for a bilingual inhibitory control advantage (BICA) but relatively consistent evidence for a global advantage in reaction time on the tasks reviewed, that they construed as a bilingual executive processing advantage (BEPA).

In two more recent meta-analytic reviews covering a much larger literature and range of tasks than was covered by Hilchey and Klein, Donnelly, Brooks and Homer (2019) and Lehtonen, Soveri, Laine, Järvenpää, De Bruin and Antfolk (2018) found weak overall effect sizes that were often eliminated when publication bias was taken into account. In contrast, Grundy (2020, this issue), in a meta-analysis that purposely does not distinguish amongst tasks or measures, concludes that the relative frequency with which bilingual (as opposed to monolingual) advantages occur far exceeds what would be expected by chance in the absence of a true effect. The last sentence of Grundy’s abstract (a similar conclusion has been reached by others, e.g. Laine and Lehtonen 2018) is forward-thinking: “…these findings are not at odds with recent metaanalyses examining overall effect sizes, but rather, highlight the need to determine when, rather than if, bilinguals outperform monolinguals on EF tasks.”

It is important to note, as many authors have, that simply finding a bilingual advantage in a cognitive process, like executive control, does not tell us that the advantage was caused by bilingual exercise. As noted by Peal and Lambert (1962; about the “more intelligent bilingual child” in Montreal) and many other authors, it is possible that those with better executive control achieved a more fluent level of bilingualism.

Perhaps the most appropriate conclusion from the findings generated by our meta-analyses is the one offered by Costa himself (2015), who, at an NSF-funded, CUNY-hosted workshop on Bilingualism and Executive Processes (https://bef2015.commons.gc.cuny.edu/program/), concluded that the “link between bilingual language control and domain executive control is an elusive one”.