Introduction

How the human cognitive system processes language has long been a question of great interest in a wide range of fields, including theoretical and applied linguistics, psycholinguistics, neurolinguistics and cognitive science. A myriad of theories and models have been put forth to capture the elaborate structure that underlies the interlacing communicative faculties of the brain and language. Some of these are drawn on in the current study, namely the BIA+ model of language processing, the Auditory Dorsal Stream model with regard to working memory capacity (WMC), the spreading activation theory of semantic processing and the Wernicke–Geschwind model. These theories and models have yielded a substantial body of empirical evidence with which one can navigate the intricacies of human language processing to a great extent. Nevertheless, their potential implications for education, pedagogy, speech pathology and several related fields, together with ongoing efforts to refine the teaching and learning methods in use today, warrant further research on the subject at hand.

Similarly, the ongoing challenges faced by would-be polyglots, whose struggles hinge upon certain neglected aspects of multilingualism, have attracted the sustained attention of a multitude of researchers over the last few decades. For instance, as Zheng and Lemhöfer (2019, pp. 7–8) note, some of these challenges relate to L2 language processing and have been shown to stem from factors such as “fewer available cognitive resources as well as lack of automaticity during online processing”. One such factor, namely the allocation of cognitive resources (specifically WMC) during on-line language processing, is explored in the present study.

Any would-be bilingual or multilingual would attest that acquiring a new language is an arduous process and that, despite considerable effort, one’s attempts to master it may not always bear fruit. Similarly, many language practitioners claim that not every individual picks up a language or its components at the same pace. For instance, O’Brien et al. (2007) maintain that phonological memory can predict the degree of success in L2 oral fluency in adults. Likewise, it has been stated that the pedagogical basis of L2 acquisition should be founded upon models of working memory (e.g., Atkins & Baddeley, 1998; Baddeley, 2003, 2015). To exemplify, Baddeley (2015) argues that, since much of L2 pedagogy is contingent upon effective learning of language components such as vocabulary and grammar, the importance of the phonological loop (i.e., a part of the working memory system that holds information temporarily) is underscored by substantial evidence.

One of the most important aspects of cognitive performance is verbal working memory, which influences language comprehension and production to a considerable extent. According to Traxler (2011, p. 437), working memory and phonological memory can affect how quickly an individual can acquire an L2. By the same token, it has been argued that verbal working memory can predict success in L2 vocabulary acquisition as well (Atkins & Baddeley, 1998). However, assessing working memory in isolation, without taking into account other variables involved in off-line and on-line language processing, will probably prevent researchers from appreciating the overarching structure of language processing. These variables include linguistic knowledge, attention, teaching strategies and age.

On that account, the present investigation examines some of the factors that have been identified in the literature as influential in bilingual language processing. In particular, it involves on-line language processing in bilinguals, which resembles a real-world situation where linguistic input and output are not isolated. Such investigations determine how bilinguals and L2 learners gain access to and use semantic information in their L2. On that front, McDonough and Trofimovich (2008) maintain that evaluating learners’ ability to efficiently allocate attention between processing tasks (i.e., the size of working memory) can determine how semantic information is activated in the learner’s mind and how this information interacts with the various linguistic data available during bilingual language processing, including phonological, syntactic and morphological information.

Bilingualism

The multidimensionality of bilingualism has led theorists to categorize the concept by introducing numerous terms to mark the boundary of each type, although drawing firm distinctions has proved fairly difficult. For instance, Li (2000) uses more than 30 specific descriptive terms, such as “additive bilingual”, “asymmetrical bilingual”, “diagonal bilingual”, “dormant bilingual”, “incipient bilingual” and “maximal bilingual”, to discriminate between varieties of bilingualism. By the same token, Ahlsén (2006, p. 122) maintains that, of the different subcategories of bilingualism in the literature, three have been identified as the most common: compound bilingualism, coordinated bilingualism and subordinated bilingualism. In the third subcategory, one language, typically L1, is dominant and has already been mastered, and L2 is used through the mediation of L1 (Ahlsén, 2006, p. 122). For the sake of exactitude, the current study addresses the third subcategory, subordinate bilingualism, which is believed to be the prevailing type amongst Persian speakers of English.

Research in L1 processing has illustrated crucial points with regard to the semantic association that may exist in the memory and human mind in general and also underlined the fact that semantic priming can demonstrate the internal organization of one’s lexical knowledge as well as the way this knowledge is made effective use of (Ratcliff & McKoon, 1988). Generally, semantic association refers to the relation two linguistic elements (i.e., words) bear, directly or indirectly, that can be regarded as meaningful. For instance, this happens when one associates the word “engine” with the word “car” semantically. Semantic priming, in turn, is one of the subcategories of the phenomenon of priming that can be defined as “the improvement in speed or accuracy to respond to a stimulus, such as a word or a picture when it is preceded by a semantically related stimulus (e.g., cat–dog) relative to when it is preceded by a semantically unrelated stimulus” (e.g., table–dog) (McNamara, 2005, pp. 3–4). By that account, it should not come as a surprise that many researchers in language processing literature have broadened their support of priming studies and have looked into the linguistic development of individuals amongst various groups of learners; these studies have examined subjects such as children (e.g., Arias-Trejo & Plunkett, 2009; Betjemann & Keenan, 2008; Luchkina & Waxman, 2021; Nation & Snowling, 1999), the elderly (e.g., Allen et al., 1997; Ouyang et al., 2020; Rogers, 2016), adults and children (e.g., Radeau, 1983) as well as bilinguals and multilinguals (e.g., Hartsuiker et al., 2016; Jiang & Forster, 2001; Tytus & Rundblad, 2016).

Memory and Language

A considerable amount of literature has been published on the relationship between language and memory. These studies have described various aspects of memory encoding of linguistic resources from empirically diverse research designs at great length. Implicit memory in particular has received considerable attention in the priming literature as priming demonstrates the effects of implicit memory in general (Graf & Mandler, 1984; Keane et al., 2015; Schacter, 1987). Manipulating working memory load to investigate to what extent the aforementioned processes rest upon cognitive resources (Heyman et al., 2015, 2017), assessing the effect of working memory load on implicit visual memory of subjects in a semantic categorization task (Castellà et al., 2020) and examining the role of working memory load on long-term priming of written words (Baqus et al., 2004) are among such studies. In addition, working memory and different aspects of language acquisition have drawn close attention (Baddeley, 2003); one aspect that concerns the current study, i.e., new linguistic component integration, has been researched for a long time (e.g., Atkins & Baddeley, 1998; Cowan, 1992; Morra & Camba, 2009). Studies on the relationship between working memory and priming have explicated the effect of WMC on different types of priming (e.g., Castellà et al., 2020). To give an instance, negative priming effect has been determined to adversely influence responses to the same previously exposed stimulus (e.g., a word).

Priming-based studies are not the only investigations that have been set out in order to put the relationship between working memory and language learning and/or performance to the test. The methodological innovation of these studies lies in varying tests utilized to measure learners’ WMC and the ways it can affect language performance when one is taking on a specific linguistic task (e.g., language tests or exhibiting one’s language skills in certain circumstances). In parallel, several studies attempted to evaluate the impact of WMC in the field of second language acquisition (SLA), one of which was carried out by Kormos and Safar (2008). The study examined the association between phonological short-term memory and WMC in a reading, listening, speaking and correct English use test. The results reported in the study suggest that the backward digit span test co-varied with the “overall English language competence” in addition to reading, listening, speaking and use of English (syntactic and semantic structures) test scores. Furthermore, phonological short-term memory capacity was identified as a component that contributes to the performance of beginners and pre-intermediate students in intensive language learning differentially.

A further piece of research on the effects of WMC on bilinguals is the comprehensive meta-analysis conducted by Grundy and Timmer (2016). It has been claimed that bilingualism gives bilinguals an edge in certain situations, such as performing executive function tasks or tasks in which mental agility proves essential. As the authors point out, some maintain that, in addition to the supposed advantages associated with bilingualism, greater WMC compared to monolinguals might be a further advantage for bilinguals, although empirical evidence on that front seems inconsistent.

To investigate the possible effects of L2 acquisition on the mental lexicon, Brien and Sabourin (2012) tested a diverse group of bilinguals, comprising 14 simultaneous bilinguals, 17 early bilinguals and 11 late bilinguals, in addition to 10 monolinguals. Using a cross-modal priming (CMP) paradigm, the researchers explored the influence of L2 on the processing of homonyms in L1. The results demonstrated no significant differences between early bilinguals and monolinguals, whereas late bilinguals exhibited longer reaction times, syntactic priming effects and lexical frequency effects. The study also shed light on the effects of semantic context on lexical access of ambiguous words. Furthermore, it appears that the physiology of the mental lexicon of second language learners may differ depending on the age of L2 acquisition. Learners whose L2 acquisition took place early in childhood recruited the same brain areas for language processing as monolinguals. Conversely, late L2 learners seem to deploy other areas of the brain.

One of the most significant studies of semantic context effects on lexical access of ambiguous words during sentence comprehension was conducted by Swinney (1979). Comprehension of sentences calls for information integration grounded in a number of ongoing cognitive processes; for instance, there is ample evidence in the literature that semantic and syntactic contexts interact with real-time comprehension processes and can affect one’s interpretation of individual words and sentences (Swinney, 1979, p. 645).

With respect to the interplay of context and lexical processing, models of lexical ambiguity resolution have been introduced (Brien & Sabourin, 2012). The most prominent method associated with such models is the cross-modal lexical decision task, which has been used in the literature to investigate the activation of lexically ambiguous words. One of the most consequential findings of Swinney’s (1979) research was that in the absence of prior disambiguating context, “all possible meanings of an ambiguous word are accessed initially”, and it seems that “it is only in the subsequent selection stage that one meaning is preferred” (Brien & Sabourin, 2012, p. 196).

In general, language studies have demonstrated various aspects of the phenomenon of on-line language processing, mostly during tasks that were targeted at monolinguals. Such studies were of great service to psycholinguists and culminated in a richer understanding of the underpinnings of word processing, speaker’s executive function and data acquisition. In turn, gaining a more comprehensive grasp of word processing in general, and language processing in particular, meant that a new wave of theoretical frameworks could be devised to be applied in real-world scenarios (i.e., in language classrooms).

The overall objective of the present study was to examine the effect of individual differences from a perspective that explains difficulties in L2 semantic processing. To that end, the study evaluated verbal working memory capacity (VWMC) and investigated the semantic processing of novel linguistic data during on-line language comprehension, as a step towards addressing these difficulties.

Taking note of what has been discussed thus far, the research questions of the present study are as follows:

  1. Is there a difference between Persian–English subordinate bilinguals of analogous proficiency levels with regard to L2 VWMC?

  2. Part 1: Do individual differences in VWMC if proved by empirical evidence, impact semantic activation of novel L2 intralingual homographs in on-line language processing?

     Part 2: If so, in what manner does this impact show itself, in particular, during lexical ambiguity resolution?

As per what has been discussed in the related literature, the following one-tailed hypotheses were put forward:

H1

Subordinate bilinguals of similar proficiency levels would demonstrate varying L2 verbal memory spans.

H2

Subordinate bilinguals of higher working memory span would display lower reaction time delays in the processing of semantic information compared to lower-working-memory-span participants.

Method

Design

The study was based on a quantitative repeated-measures factorial design.

Participants

Nineteen Persian–English bilinguals (sixteen males and three females), who differed in linguistic and educational background as well as age, were recruited from the general population to take part in the study. Recruitment spanned a 68-day period, and snowball sampling was used to recruit the majority of the participants. For the purposes of the current study, two criteria were of particular importance: the participants’ proficiency level and their status as subordinate bilinguals. At the time of the experiments, the participants were receiving private English lessons from one of the researchers in preparation for common standardized language proficiency tests (i.e., IELTS or TOEFL). To compensate them for their time and to incentivize them to carry out all the required tasks, the teacher-researcher waived two thirds of the actual tuition fee.

The age of the participants ranged from 21 to 38 years (M = 27.588, SD = 4.728). Every participant was asked to fill in a consent form and was assured that any data gathered would remain strictly confidential. Since the study focuses on subordinate bilinguals as the main subgroup of interest, every participant was scheduled for a one-to-one interview; subsequently, the data collected from the interview, together with the analysis of the Language Experience and Proficiency Questionnaire (LEAP-Q), determined whether an individual was suitable as a participant. As for proficiency, the results of the self-report questionnaire and the short one-to-one interview revealed that all of the participants were at the upper-intermediate level of English proficiency (the approach taken to assess proficiency was similar to that used in Marian and Fausey, 2006). Also, none of them knew more than two languages (Persian and English).

Instruments

Experiment 1: Framework and Materials

The theoretical framework of Experiment 1 followed the complex span paradigm, which has been recognized as one of the most frequently used groups of experimental tasks for measuring an individual’s WMC. The complex span paradigm includes tasks such as the reading span task (used in the current study), the operation span task and the rotation span task (McDonough & Trofimovich, 2008).

The stimuli for the reading span task in Experiment 1 were chosen by the researchers and were not part of any other study. The sentences were chosen to be syntactically, semantically and lexically simple relative to the proficiency level of the participants, so that the results would be explained mostly by the participants’ cognitive performance and not by linguistic deficiencies. For instance, the sentences included words with which participants would be highly familiar (the exact list of the materials used in Experiment 1 can be found in Appendix B). All trials were presented on a computer using PsychoPy 2021.1.4.

Experiment 2: Framework and Materials

Experiment 2 was founded upon the CMP paradigm, an on-line method designed to investigate the automatic processing of linguistic input in the mind (Roberts, 2013, p. 212). According to Marinis (2018), the Cross-Modal Priming Task (CMPT), a reliable psycholinguistic method developed by David Swinney, measures the activation of lexical and syntactic information during sentence comprehension. The method has been considered one of the best experimental methods for investigating ambiguity resolution in both monolingual and bilingual language processing.

Eighteen English homographs were chosen as the experimental stimuli for the CMP study in Experiment 2. Each experimental trial involved presenting a prime or a non-primed (i.e., control) item followed by a visually presented target (probe) item. The probes were either semantically related or unrelated to the preceding primes, and each participant made a semantic decision on them. The visually presented probes had one of three relations to the prime: they could be semantically related to the subordinate reading of the homograph (i.e., contextually appropriate) such as “Enclosure” in “We have also converted sheep pens into extra living space so when friends come, they have their own kitchen, bathroom and bedroom.” (“Pen” is the prime and “Enclosure” is the probe here.); they could be semantically related to the dominant reading of the homograph (i.e., contextually inappropriate) such as “Writing” in the example sentence above; or they could be unrelated such as “Clock” in the example sentence.

All the sentences were designed to be contextually biased towards the subordinate meaning of the homographs (the meaning that participants had to encode prior to Experiment 2). Each homograph appeared in the sixth or seventh position in its sentence. Fillers were interspersed amongst the experimental stimuli at predetermined points. Responses to these fillers did not count towards a participant’s performance in the task. Generally, fillers are non-experimental sentences that have nothing to do with the experimental stimuli of the study; they do not contain any prime, probe or control words. The aim of including a certain number of fillers is to prevent subjects from guessing or devising a strategy to carry out the tasks (Roberts, 2013). The fillers were paired with pseudowords rather than real English words. All of the experimental sentences were chosen from the Oxford Dictionary (Oxford University Press, 2021), but a few alterations were made to each sentence to make sure it fit the purpose of the study (e.g., the positioning of the homograph may have been changed or certain words may have been added or omitted) (see Appendix C for the exact positioning of the experimental stimuli).

All sentences were simple declaratives and followed the same grammatical structure. Since lexical processing is sensitive to various factors such as frequency, extra caution should be exercised in the selection of primes and target probes (Roberts, 2013, p. 220). Bearing that in mind, frequency data were taken into account before the homographs were introduced. As Brysbaert et al. (2017) contend, word frequency can play a potentially important role in general lexical access and processing and, in particular, in the disambiguation of lexical items. The same authors also maintain that lexical access and the CMPT are quite sensitive to word frequency.

On that ground, and following the suggestion of Brysbaert et al. (2017), the Zipf scale was selected as the frequency measure in the present study. The Zipf scale runs from 1 (1 per 100 million words) to 6 (1000 per million words). The lower half of the scale (1–3) comprises low-frequency words; in contrast, the upper half (4–6) represents high-frequency words. The log frequency (Zipf value) of the experimental stimuli ranged from 3.08 for the homograph “bonnet” to 5.30 for the homograph “issue” (M = 4.27, SD = 0.59).
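
The two anchor points given for the scale (1 per 100 million words at Zipf 1, 1000 per million words at Zipf 6) are consistent with the standard definition of the Zipf value as the base-10 logarithm of the frequency per million words plus 3. The short sketch below illustrates that conversion; the function names are hypothetical and chosen for illustration only, and the corpus from which the study's frequencies were drawn is not assumed.

```python
import math


def zipf_from_per_million(freq_per_million: float) -> float:
    """Convert a frequency per million words to a Zipf value.

    Zipf = log10(frequency per million words) + 3,
    which equals log10(frequency per billion words).
    """
    if freq_per_million <= 0:
        raise ValueError("frequency must be positive")
    return math.log10(freq_per_million) + 3.0


def per_million_from_zipf(zipf: float) -> float:
    """Inverse conversion: Zipf value back to frequency per million words."""
    return 10 ** (zipf - 3.0)


if __name__ == "__main__":
    # Anchor points quoted in the text.
    print(zipf_from_per_million(0.01))   # 1 per 100 million words
    print(zipf_from_per_million(1000))   # 1000 per million words
    # Approximate per-million frequencies implied by the reported extremes.
    print(per_million_from_zipf(3.08))   # "bonnet"
    print(per_million_from_zipf(5.30))   # "issue"
```

Read this way, the reported range of 3.08 to 5.30 corresponds to roughly one to two hundred occurrences per million words.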

One-to-One Interview

As a means to test the participants’ English proficiency level and to ensure each participant was a good fit for the study, a short online one-to-one interview was conducted. The interview encompassed a brief account of the participants’ linguistic and educational background. During the interview, the timeline of the study was also explained to the participants in order to decrease any likelihood of a participant dropout.

LEAP-Q

In order to assess the linguistic background of the participants thoroughly, the LEAP-Q was administered. For the sake of convenience, the questionnaire was distributed electronically. Although an in-depth analysis of the LEAP-Q in the context of language learning is not the focus of the current study, it is worth noting that the questionnaire is a validated and reliable tool for collecting self-reported data on the linguistic proficiency and language status of bilinguals aged 14 to 80 years. The rationale for using it lies in the sheer number of studies that have employed the tool for more than a decade and in the fact that the questionnaire has been validated against behavioral measures of speech and language ability in correlation analyses (Kaushanskaya et al., 2020; Marian et al., 2007).

Reading Span Task

In Experiment 1, the L2 WMC of each individual was assessed using a commonly used class of memory span measurement tasks known as the complex span paradigm. The tasks chosen for the present study were adapted from Wilhelm et al. (2013). In particular, a complex span task called the reading span task was opted for.

Simple Translation Task

Prior to Experiment 2, the participants were required to memorize a list of homographs whose subordinate meaning could be considered “novel” for them. To make sure that the subordinate meaning of the homographs was new to the participants, a list of highly familiar homographs was chosen by the researchers and the participants performed a simple word translation task. They were asked to write the meanings of the homographs presented in the list either in English or in Persian. They were informed that the homographs had more than one meaning and that they should provide both meanings of each. A participant’s inability to offer any English definition or Persian translation of the subordinate meaning of a word was taken to mean that he/she was unfamiliar with it. All of the homographs in the list were taken from Gaskell et al.’s (2019) study. In the end, 18 homographs were settled on, all of which were nouns.

CMPT

The mechanics of the method used in Experiment 2 did not differ from those of a standard CMP study. The design of all trials followed the conventions of a typical CMP trial as described by Roberts (2013). In a typical trial, the participant hears a sentence presented auditorily via headphones whilst seated in front of a monitor or laptop screen. The participant is instructed to pay attention to the sentence and try to comprehend it. Before the sentence comes to an end, a visual probe, which can be a word or a picture, appears on the screen. The participant should then respond to the probe as quickly as possible. The type of response, which could be a simple lexical or semantic decision, is decided beforehand by the researcher(s), as is the exact moment at which the probe appears on the screen (e.g., 500 or 1000 ms after the prime). Once the participant has responded and the auditorily presented sentence has come to an end, there may be a comprehension question about the sentence to make sure that the participant was attending to the meaning of the stimuli. After one trial is concluded, the next trial appears, the timing of which is determined by an intertrial interval (ITI).

Procedure

Experiment 1

During the reading span task, the participants were required to recall letters in the context of a rather simple reading comprehension task. The participants saw several sentences presented consecutively on a laptop monitor. Each sentence was paired with a single letter, which would appear below or in front of its corresponding sentence. The participants had to memorize the letters whilst evaluating the meaningfulness of the presented sentence; for instance, “You can eat music from an online store.” would require a “no” response, in which case the participant had to press the “n” key on the keyboard. If the presented sentence made sense, the participant would press “y”. All other keys were deactivated so that an accidental key press would not be registered as an undefined response. After the last sentence in each trial had been evaluated, a recall cue on the screen would indicate that the participants should type in the letters of that particular trial in their correct serial position. Sentence-letter pairs were presented without time constraints during the memorization and recall phases. As soon as the participant responded to the meaningfulness of a sentence, the next stimulus pair would appear. An intertrial interval (ITI) of 1000 ms was used throughout all of the main trials.

During the main trials of the study, four load levels, from two to five, were set. Three practice trials and 12 main trials were administered for the reading span task, with each load level consisting of three main trials. The score for each main trial was the proportion of letters recalled in their correct serial position. After completion of all 12 main trials in Experiment 1, the average score of each participant across all trials was calculated as their final WMC score. It took between 11 and 17 min for each participant to complete the reading span task. Each administration of the task took place in a quiet room where only the participant and the researcher were present.
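
The scoring rule described above (proportion of letters recalled in the correct serial position per trial, averaged across trials) can be illustrated with a minimal sketch. The function names and the example letters are hypothetical, and the position-by-position comparison of the typed response against the presented sequence is an assumption about how serial position was checked; the study's own scoring script is not described in this much detail.

```python
from statistics import mean


def trial_score(presented: str, recalled: str) -> float:
    """Proportion of presented letters recalled in their correct serial position.

    Assumption: the typed response is compared index by index with the
    presented sequence, so an omitted letter shifts all later letters.
    """
    if not presented:
        raise ValueError("a trial must contain at least one letter")
    hits = sum(
        1
        for i, letter in enumerate(presented)
        if i < len(recalled) and recalled[i].upper() == letter.upper()
    )
    return hits / len(presented)


def wmc_score(trials: list[tuple[str, str]]) -> float:
    """Final WMC score: mean of the per-trial proportions (maximum 1.00)."""
    return mean(trial_score(presented, recalled) for presented, recalled in trials)


if __name__ == "__main__":
    # Hypothetical responses at load levels 2, 3 and 4.
    example = [("FK", "FK"), ("QRT", "QTR"), ("HJLM", "HJL")]
    for presented, recalled in example:
        print(presented, recalled, trial_score(presented, recalled))
    print("WMC score:", wmc_score(example))
```

Averaging proportions rather than counting only perfectly recalled trials gives partial credit, which is why final scores can fall anywhere between 0 and 1.00.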

Learning Phase

Subsequent to the selection of experimental stimuli, the participants had to partake in a two-session learning phase during which unknown meanings (i.e., subordinate meaning) of the homographs would be learnt. One of the underlying reasons behind opting for the conception of the learning phase was to control the encoding process of the linguistic input as much as possible.

The learning phase comprised two sessions stretching across 48 h. Each session took approximately 25 min. Each homograph was presented three times in each session. The presentation of the homographs was structured in blocks of trials. Each block encompassed one presentation of each of the 18 homographs (18 trials per block). Once a block came to an end, a new block would start, up to the third and last block. Every trial started with the visual presentation of the homograph for 5000 ms. Then a visual depiction (i.e., an image) of the subordinate meaning of the homograph appeared for 3000 ms. Finally, an English definition of the homograph along with two examples appeared. No Persian translation was provided, since learning the materials was based upon the Encoding Specificity Principle. This principle states that the context in which a piece of information (e.g., a memory) has been encoded can directly influence the degree to which the same piece of information will be retrieved successfully. For instance, according to the findings of a recognition-memory task by Tulving and Thomson (1971), the accessibility of an encoded concept is controlled by so-called “retrieval cues” whose availability depends upon the context in which encoding has taken place.

The English definitions as well as the examples were retrieved from the Oxford Dictionary (Oxford University Press (OUP), 2021). Each example was used to demonstrate the context in which each meaning of the homograph (dominant and subordinate) is used. None of the sentences used in the learning trials were similar to those used in the main trials of Experiment 2. The intertrial interval was 1000 ms. The order of presentation of the homographs was completely randomized so that participants would not be able to guess or strategize as to how the experimental stimuli would appear in the main trials of Experiment 2. Approximately 24 h after completing the first session, each subject was required to complete the same three blocks of trials again in a second session, in order to aid the consolidation of the meanings related to each homograph; doing so was considered essential as many studies have shown that “retention and integration of new linguistic knowledge can benefit from a consolidation period and sometimes from a sleep period” (Gaskell et al., 2019, p. 1). Each participant completed the sessions individually.
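
The structure of one learning session (three blocks, each presenting all 18 homographs once in random order, with the stated display durations and intertrial interval) can be sketched as a plain trial schedule. This is a hedged illustration only: the placeholder homograph labels, the function name and the choice to shuffle independently within each block are assumptions, and the duration of the definition screen is left unset because it is not reported.

```python
import random

WORD_MS = 5000            # homograph shown on its own
IMAGE_MS = 3000           # image of the subordinate meaning
ITI_MS = 1000             # intertrial interval
BLOCKS_PER_SESSION = 3    # each homograph appears once per block


def build_session(homographs, seed=None):
    """Return a list of trial dicts for one learning session.

    Assumption: the order is reshuffled independently in every block.
    """
    rng = random.Random(seed)
    schedule = []
    for block in range(1, BLOCKS_PER_SESSION + 1):
        order = list(homographs)
        rng.shuffle(order)
        for word in order:
            schedule.append({
                "block": block,
                "homograph": word,
                "word_ms": WORD_MS,
                "image_ms": IMAGE_MS,
                "definition_ms": None,  # not reported in the text
                "iti_ms": ITI_MS,
            })
    return schedule


if __name__ == "__main__":
    # Placeholder labels; the actual 18 homographs are listed in the appendices.
    items = [f"homograph_{i:02d}" for i in range(1, 19)]
    session = build_session(items, seed=1)
    print(len(session), "trials in one session")
```

A schedule of this kind could then be handed to whatever presentation software is in use; nothing here depends on a particular package.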

Experiment 2

The study’s 17 participants were randomly assigned to receive an equal number of the 18 experimental sentences. Each participant would receive any one of the three different types of probe condition (i.e., semantically related and contextually appropriate, semantically related but contextually inappropriate, or unrelated). The order of sentence presentation was pseudo-randomized and was also counterbalanced across all participants and across all trials. During Experiment 2, no participant heard the same homograph or saw the same probe twice.

Experimental sentences were digitized and added to the CMP study in PsychoPy’s interface. As in Experiment 1, the participants completed the CMP experiment individually, in a quiet room where only the participant and the teacher-researcher were present. In addition to two short practice trials, each participant was given clear instructions on how to complete the task. It took approximately 12–14 min for each participant to complete Experiment 2.

Data Analysis

The experiments designed for the current study, as well as the practice trials and the learning phase, were created and run in the open-source software package PsychoPy 2021.1.4 running on Windows 10 1903. Statistical analyses of the data and the results of the priming tasks were carried out by means of SPSS 26.00.

Results

Examination of L2 VWMC

The first research question concerned the possible variation in the performance of subordinate bilinguals in an L2 VWMC task, especially considering that the participants’ linguistic skill remained identical. Experiment 1 involved measuring each participant’s WMC using a reading span task. At the end of Experiment 1, differences in participants’ performance made it quite simple to categorize each subject into either the high or the low WMC group. Figure 1 shows the performance of each member. The final scores ranged from 0.26 to 0.85 (M = 0.49, SD = 0.15). Since the maximum score in the reading span task was 1.00, any participant with a score greater than 0.50 was labelled as a high WMC group member whereas a score below the cut-off point was considered as low WMC.
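The grouping rule just described amounts to a single threshold comparison, sketched below. The participant labels and the pairing of scores with participants are hypothetical (only the range endpoints 0.26 and 0.85 come from the text), and treating a score of exactly 0.50 as low WMC is an assumption, since the text does not say how such a score would be handled.

```python
CUTOFF = 0.50  # half of the maximum reading span score of 1.00


def wmc_group(score, cutoff=CUTOFF):
    """Label a reading span score as 'high' or 'low' WMC."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("reading span scores are expected to lie between 0 and 1")
    # Assumption: a score exactly at the cut-off is treated as low WMC.
    return "high" if score > cutoff else "low"


# Hypothetical participants; the individual scores are not listed in this section.
example_scores = {"P01": 0.26, "P02": 0.41, "P03": 0.58, "P04": 0.85}
groups = {pid: wmc_group(score) for pid, score in example_scores.items()}
print(groups)
```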

Fig. 1 Line chart of low and high WMC groups by score (Experiment 1)

One of the intriguing points about the scores of WMC in Experiment 1 was the fact that although the participants shared quite a substantial amount of language background and experience, the WMC scores were disparate. This point becomes clearer if one interprets the findings of Experiment 1 using the self-report questionnaire filled in by the subjects at the start of the study.

Findings of the LEAP-Q

As stated earlier, the LEAP-Q was administered at the start of the study to shed light on the linguistic background of the participants. For a more efficient organization of data, the participants were categorized based on their WMC score in Experiment 1. Table 1 summarizes the participants’ responses to the questionnaire items and shows how the members of the high and low WMC groups differed in their language-learning background. The table incorporates data such as age at the time of participation in the current study, proficiency in understanding, speaking and reading in both Persian and English, onset age of acquisition for both Persian and English, age of attaining fluency in either language, number of years spent amongst family members where either language was spoken, number of years spent in a workplace or school where either language was spoken, and current exposure to Persian and English.

Table 1 Responses of the participants of high and low WMC group to the questionnaire

Data Treatment

Before the second research question is addressed, a few points about the handling of the collected data need to be clarified. Since certain assumptions underlie the use of parametric statistical models such as Analysis of Variance (ANOVA), it seemed pertinent that the researchers examine the details of the data they were working with prior to performing any analysis. As such, in order to investigate the distribution or symmetry of the data, a test of normality was run for each set of data; the Shapiro–Wilk test was used for this purpose. By convention, if the Sig. value of the Shapiro–Wilk test is greater than 0.05, the hypothesis of normality is not rejected and the data can be treated as approximately normal. As can be seen in Table 2, all Sig. values for the Shapiro–Wilk test are greater than the designated 0.05 cut-off point.

Table 2 Tests of normality for the data sets

Even though the test of normality (Shapiro–Wilk) did not indicate massive deviations from normality, the researchers initially remained skeptical of such an outcome. This was due to two factors. Firstly, the small sample size was considered a major issue to grapple with, especially regarding the normality of data. Secondly, as noted by Hair et al. (2013, pp. 71–72), researchers should exercise caution when running tests of normality such as Shapiro–Wilk or Kolmogorov–Smirnov, because these tests of significance are less useful for small sample sizes (fewer than 30) or large sample sizes (greater than 1000). Thus, researchers should consider a combination of measures to evaluate the degree of non-normality. On that account, skewness and kurtosis were both computed; the results can be seen in Table 3.

Table 3 Descriptive statistics including skewness and kurtosis for the data sets

According to the descriptive statistics of each data set for the three conditions in Table 3, the skewness values are not close to 1 or − 1, which indicates that deviations from normality are not too extreme, as skewness values around 1 or − 1 may reflect a large degree of departure from normality (Dancey & Reidy, 2017, p. 83). Additionally, the three data sets show moderate skewness, meaning that, as predicted by the researchers, the distribution of the data is not completely normal, which stands to reason considering the sample size. Having said that, repeated-measures ANOVA, which is the statistical model of choice for the current study (based on similar approaches taken by researchers in the literature), requires only “approximate” normality, since ANOVA is considered robust to minor violations of normality, meaning that despite a minor violation of normality the results could still be valid (Laerd Statistics, n.d.).
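For readers who wish to reproduce this kind of check, the following sketch computes bias-adjusted sample skewness and excess kurtosis in plain Python. The formulas are the standard adjusted estimators commonly reported by statistical packages; whether they match the values in Table 3 exactly is an assumption, and the reaction times used here are invented for illustration only.

```python
import math


def _z_scores(values):
    """Standardize values using the sample standard deviation (n - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if sd == 0:
        raise ValueError("all values are identical; shape statistics are undefined")
    return [(x - mean) / sd for x in values]


def sample_skewness(values):
    """Bias-adjusted sample skewness (zero for a perfectly symmetric sample)."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three observations are needed")
    z = _z_scores(values)
    return n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)


def sample_excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (about zero for normal data)."""
    n = len(values)
    if n < 4:
        raise ValueError("at least four observations are needed")
    z = _z_scores(values)
    first = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
    return first - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))


# Invented reaction times (ms), not data from the study.
rts = [702.0, 731.0, 745.0, 760.0, 781.0, 802.0, 910.0]
skew = sample_skewness(rts)
kurt = sample_excess_kurtosis(rts)
# Rule of thumb cited in the text: skewness around 1 or -1 signals a large departure.
print(f"skewness = {skew:.3f}, excess kurtosis = {kurt:.3f}, |skewness| < 1: {abs(skew) < 1}")
```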

Performance Variation Amongst WMC Group Members

Table 4 summarises descriptive statistics with regard to the WMC group in each condition of experimental stimuli. In particular, mean RTs (reaction times) as a function of WMC demonstrate differences between both the high WMC group and the low WMC group as well as variations in reaction time to each condition.

Table 4 Descriptive statistics of mean RTs as a function of WMC group membership

The differences in reaction times amongst participants show that the presented visual probes were not processed in a similar fashion. For instance, participants who were categorized as having low WMC responded about 202 ms more slowly than those of the high WMC group in the first condition, where all probes were semantically related to the homograph (prime) and contextually appropriate. The gap in response time between high-span and low-span participants became even bigger (233 ms) for the third condition (unrelated or control probes). Furthermore, Fig. 2 illustrates high and low WMC subjects and their respective reaction times to experimental stimuli in Experiment 2.

Fig. 2 The line graph of each WMC group as a function of RTs

Lexical Ambiguity Resolution of L2 Homographs

The second research question raised in the current study was with regard to the possible interaction between each WMC group member’s performance and processing of semantic information (L2 homographs). Most importantly, it was argued that if such an interaction was to be found between subjects, the manner in which this potential interaction manifested itself in the processing of lexical items should be laid bare.

To further examine a potential interaction between WMC and reaction times to experimental stimuli in different conditions in Experiment 2, an overall 2 (between-subject variable: high vs low WMC) × 3 (within-subject variable: visual probe type) repeated-measures ANOVA was conducted. Table 5 showcases the tests of within-subject effects for said variables.

Table 5 Tests of within-subjects effects

Using the Greenhouse–Geisser correction, the results showed that there was a significant overall difference between probe type conditions (F (2,28) = 134.154, p = 0.001), with an overall effect size (partial eta squared, ηp2) of 0.899. Also, data on the possible interaction between visual probe types and WMC group membership revealed a statistically significant difference in how the high WMC and low WMC groups processed the visual probes in each condition of Experiment 2 (F (2,2) = 3.622, p = 0.043, ηp2 = 0.194). According to Table 5 above, the Sig. value for visual probe types (i.e., subordinate reading of the homographs in a contextually appropriate condition, dominant reading of the homographs in a contextually inappropriate condition and unrelated or control stimuli) points towards a statistically significant difference between the presented conditions.

Tests of between-subject effects also confirmed that there are significant differences between WMC groups with regard to the processing of each visual probe type. Therefore, it can be said that the main effect of WMC group across the three conditions in Experiment 2 is statistically significant (F (1,15) = 43.886, p < 0.001).

Table 6 below, using Bonferroni-adjusted t-tests, compares the WMC groups’ performance with regard to reaction times, giving the mean difference between the groups, the standard error, the probability value and the 95% confidence limits around the mean difference. The first row compares low-WMC participants with high-WMC participants. The mean difference in reaction times is 212 ms (p = 0.001), which indicates a statistically significant difference between the groups’ performance, confirming that participants with high WMC do not perform similarly to their low WMC counterparts and vice versa. This outcome aligns with what was reported in Table 4 and also, more clearly, in Fig. 2. As Fig. 2 illustrates, the variation in response delays to each experimental condition in Experiment 2 is due to low-WMC group members being outperformed in the on-line processing of the stimuli on the grounds of lower WMC.

Table 6 Pairwise comparisons

Table 7 compares each condition with every other condition, giving the mean difference between every pair, the standard error, the probability value and the 95% confidence limits around the mean difference. The first row compares the subordinate-meaning, contextually appropriate condition with the dominant-meaning, contextually inappropriate condition. The mean difference is 53.77 ms, which indicates a statistically significant difference between the two conditions. This row also compares the first condition with the control condition. The difference here is 104.06 ms, and the associated probability level is < 0.001. The second row compares the dominant reading of the experimental stimuli with the other two conditions. Here, the mean difference between the second condition and the third condition is 50.29 ms, which is statistically significant. It should be borne in mind that numbers 1, 2 and 3 in the aforementioned table refer to the conditions “subordinate meaning-contextually appropriate”, “dominant meaning-contextually inappropriate” and “unrelated or control probes” respectively.

Table 7 Pairwise Comparisons
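As a reminder of what the Bonferroni adjustment in the pairwise comparisons does, the sketch below multiplies each unadjusted p-value by the number of comparisons and caps the result at 1. The raw p-values are invented for illustration, and the function name `bonferroni_adjust` is hypothetical; the sketch does not reproduce the SPSS output.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: each p times the number of tests, capped at 1."""
    m = len(p_values)
    if m == 0:
        return []
    for p in p_values:
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie between 0 and 1")
    return [min(1.0, p * m) for p in p_values]


# Invented unadjusted p-values for three pairwise condition comparisons.
raw = {"1 vs 2": 0.004, "1 vs 3": 0.0002, "2 vs 3": 0.012}
adjusted = dict(zip(raw, bonferroni_adjust(list(raw.values()))))
for pair, p in adjusted.items():
    print(f"{pair}: adjusted p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

With three comparisons, an unadjusted p-value has to fall below roughly 0.0167 for the adjusted value to stay under 0.05.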

Finally, Table 8 gives the overall description of the interaction of working memory and processing of each visual probe type. A quick look at the table suggests that control words caused the most delay in processing (i.e., took the longest to be processed) for both WMC groups; in that respect, high WMC group members processed the control words in 795 ms on average whilst low WMC group members took 1028 ms to do so. The fastest response times for both groups belonged to the semantically related and contextually appropriate condition, in which the participants with higher WMC performed approximately 202 ms quicker on average than their low WMC counterparts (908.67 ms and 706.57 ms for low and high WMC groups respectively). The processing of semantically related but contextually inappropriate stimuli was also about 204 ms quicker on average for high WMC group members.

Table 8 Interaction between WMC group and visual probe type

The Effect of Working Memory on Lexical Processing of L2 Homographs

To further investigate the interaction between WMC and different types of visual probes, two repeated-measures ANOVAs were conducted on the reaction times to the visual probe types, one for the high WMC group and one for the low WMC group.

For the participants with a high WMC, the result revealed a significant effect on the processing of each visual probe type (F (2,11) = 54.64, p < 0.001, ηp2 = 0.886). Post hoc tests with Bonferroni’s correction were also performed. The findings showed a mean of 706.57 ms for high WMC participants to access the subordinate and contextually appropriate meaning of the homographs, 759.31 ms to access the dominant but contextually inappropriate meaning of the homographs and 795.33 ms to access the meaning of unrelated control words (all ps < 0.001).

By the same token, for the low-WMC participants, the repeated-measures ANOVA revealed a clear main effect of visual probe type (F (2,16) = 83.243, p < 0.001, ηp2 = 0.912). Post hoc tests with Bonferroni’s correction were also performed. The findings showed a mean of 908.66 ms to access the subordinate and contextually appropriate meaning of the homographs, 963.47 ms to access the dominant but contextually inappropriate meaning of the homographs and 1028.03 ms to access unrelated control words. These results support the study’s hypothesis of faster lexical access for participants with higher WMC.
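The six cell means reported in the two preceding paragraphs can be cross-checked with simple arithmetic. The sketch below takes the means for the high and low WMC groups as given, computes the between-group gap in each condition, and then derives condition contrasts from unweighted averages of the two group means. Using unweighted averages is an assumption about how the marginal means were formed, and the condition labels are shorthand introduced here.

```python
CONDITIONS = ("appropriate", "inappropriate", "control")

# Mean reaction times (ms) as reported for each WMC group and probe type.
mean_rt = {
    "high": {"appropriate": 706.57, "inappropriate": 759.31, "control": 795.33},
    "low": {"appropriate": 908.66, "inappropriate": 963.47, "control": 1028.03},
}

# Gap between the low and high WMC groups within each condition.
group_gap = {c: mean_rt["low"][c] - mean_rt["high"][c] for c in CONDITIONS}

# Assumption: condition marginal means are unweighted averages of the two group means.
marginal = {c: (mean_rt["low"][c] + mean_rt["high"][c]) / 2 for c in CONDITIONS}
contrasts = {
    "appropriate vs inappropriate": marginal["inappropriate"] - marginal["appropriate"],
    "appropriate vs control": marginal["control"] - marginal["appropriate"],
    "inappropriate vs control": marginal["control"] - marginal["inappropriate"],
}

for condition, gap in group_gap.items():
    print(f"{condition}: low minus high = {gap:.1f} ms")
for label, diff in contrasts.items():
    print(f"{label}: {diff:.3f} ms")
```

Under this assumption the group gaps come out at roughly 202, 204 and 233 ms, and the condition contrasts agree, to within rounding, with the mean differences of 53.77, 104.06 and 50.29 ms quoted from the pairwise comparisons.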

Overall, the findings of Experiments 1 and 2 suggest that there is a statistically significant difference in the process of lexical access and retrieval between subordinate bilinguals of various WMCs. The general findings in respect of both experiments are encapsulated in Fig. 3.

Fig. 3 Illustration of each WMC group’s performance in each experimental condition by mean RTs

Discussion

General Discussion

The present study aimed at examining the role of WMC as a cognitive system in subordinate bilinguals’ lexical access and on-line language processing using English (L2) intralingual homographs. In particular, lexical ambiguity resolution was investigated by means of a CMPT which is specifically designed to be used in such an area of research. In turn, as has been pointed out in the literature, factors such as working memory capacity, lexical access and resolution of ambiguity in L2 are to a great extent intertwined.

Based upon the findings of the studies on the matter, it was hypothesized that bilinguals’ cognitive capabilities can be considered as a component that explains dissimilarities in the performance of various bilingual groups in that a bilingual of a higher working memory span would display fewer delays of lexical access and processing than a bilingual of lower working memory span. For instance, Brien and Sabourin (2012), using the CMP paradigm, reported that simultaneous, early and late bilinguals do differ in the processing of stimuli (homonyms) in respect of reaction time. Similarly, the findings of the present study suggest that even bilinguals who can be categorized under the same umbrella of bilingualism and demonstrate a quite similar degree of language proficiency do not process and access linguistic data in the same fashion.

Furthermore, with regard to the on-line processing of sentences containing ambiguous words, the findings provide corroborative evidence as to the assistance of semantic context in resolving lexical ambiguity. Perhaps the most significant piece of research on the subject is Swinney (1979). In the second experiment of that study, facilitatory effects were observed for the contextually appropriate meanings of the ambiguous words in the lexical decision task, and Swinney made arguments for the autonomy of the process of lexical access, at least with regard to the effects of semantic context. This effect of semantic context was replicated in the findings of the present study. Moreover, the pattern of response shown by the participants of the current study is in conformity with the Reordered Access Model (Duffy et al., 1989; Rayner & Duffy, 1987).

Individual Differences in Bilinguals’ Performance

One of the main areas of research in respect of the current study was the bilinguals’ individual differences in performance in the storage, activation and access of semantic information in their L2. Particularly, the capacity of working memory or an individual’s cognitive ability to access and hold information in the most efficient way possible between language processing tasks is of great interest since it is well known in the literature that WMC may influence the kind of processing integral to semantic priming (McDonough & Trofimovich, 2008, p. 93). The findings of the present study confirm such an influence as the subjects of high WMC displayed faster reaction times to the experimental stimuli compared to those of low WMC. Moreover, the statistically significant differences found between the processing of visual probe types clearly demonstrate that each participant processes semantic information differentially. Similarly, the major influence of working memory in language acquisition and processing should not be forgotten (Skehan, 2015, p. 189; Yi & Choi, 2021). Likewise, individual differences can account for a considerable amount of variance in language learning and processing (Falandays & Spivey, 2020; Li, 2022). In the case of the present study, lexical disambiguation seems to function differentially for individuals of high and low WMC (Assche et al., 2020, p. 58; Falandays & Spivey, 2020, p. 23; Miyake et al., 1994). In sum, the findings of both experiments along with the self-report LEAP-Q substantiate the empirical evidence in the relevant literature as to the existence of individual differences in language processing. These results signify that factors such as the cognitive abilities of each learner, in addition to L1 and L2 exposure and experience, are important variables to consider in the study of bilingual language processing.

The Influential Role of Prior Semantic Context During On-line Processing

According to the Reordered Access Model (Duffy et al., 1989; Rayner & Duffy, 1987), two main interacting factors can affect access to word meanings, namely, meaning dominance and contextual cues. Meaning dominance or the reading with higher frequency is generally easier to access compared to the less dominant meaning (i.e., subordinate meaning) or the reading with lower frequency. Equivalently, Brien and Sabourin (2012, p. 197) contend that as per the Reordered Access Model lexical access is “exhaustive” and that preceding contextual information as well as meaning dominance determines the order by which meanings are accessed, with contextually-biased and higher-frequency meanings of words being accessed more quickly than contextually-unbiased and lower-frequency meanings. This pattern of lexical access is in accordance with what was revealed in Experiment 2.

Comparably, as Traxler (2011, pp. 118–119) details, when one encounters a word, the bottom-up input activates all of the semantic representations in relation to the word. He maintains that word representations are organized according to the TRACE model of lexical access in a way that “when more than one representation is activated by a word, the activated representations compete with one another” (pp. 118–119). But as for the processing of biased ambiguous words such as the homographs that appeared in the current study, the lexical competition does not take long since the dominant reading of the word wins swiftly. If the frequencies of the two readings of an ambiguous word are roughly equal (i.e., balanced ambiguous words), the word is more difficult to process as it takes longer for one of the competing representations to win over the other.

Similarly, the context in which the word appears influences the process of ambiguity resolution. When context and frequency of reading both favor the dominant meaning of an ambiguous word, competition between multiple activated readings of the word is transient in that the dominant meaning wins quite easily. And when context favors the reading with lower frequency, “its activation is raised to the point where it becomes an effective competitor with the more dominant meaning” (Traxler, 2011, p. 119). This means that the subordinate reading can be singled out when contextual cues are there to resolve the lexical ambiguity, but it takes longer for the subordinate reading to overcome the meaning with higher frequency. This pattern of input processing was indeed discerned in Experiment 2, where the subordinate reading of the experimental stimuli in a condition that provides a disambiguation cue (i.e., the appropriate context) drew a faster response time than the other two conditions (i.e., inappropriate context condition and control condition).

Contextual Priming Effects

To further account for how L2 intralingual homographs were processed, a model of bilingual language processing known as the Bilingual Interactive Activation Plus (BIA +) Model (Dijkstra & van Heuven, 2002) can be exploited. According to the BIA + Model, for unbalanced bilinguals, whose L1 dominates their L2 (similar to the linguistic status of the subordinate bilinguals in the current study), L2 lexical items have “a lower subjective frequency” compared to the L1 words; as a result, L2 representations will have “a lower resting-state activation” than those of L1. Specifically, in both monolingual as well as bilingual reading, lexical access is thought to be influenced by the sentential context in which the word has been used (Palma & Titone, 2020, p. 162). This influence of context in the disambiguation process of homographs (lexical access of the appropriate meaning) is what the BIA + Model explains. As Falandays and Spivey (2020, p. 34) maintain, contextual priming in the BIA + Model can affect the initial state of the system and consequently “bias it toward one meaning of an ambiguous word”. Such an effect was indeed witnessed during the CMPT, where subjects, regardless of their WMC group membership, demonstrated a faster response time to contextually appropriate stimuli than to contextually inappropriate or control words.

In sum, as Assche et al. (2020, p. 57) put it, there exists an overwhelming body of empirical evidence pointing towards the fact that contextual cues help in the lexical disambiguation of homographs and homophones, at least to a certain extent. But this should not be interpreted as a complete ascendancy of sentential context over the automatic activation of multiple meanings, and certainly not over the activation of highly familiar meanings of the words; in lieu of such an interpretation, it can be argued that contextual cues may help, to some extent, in the “subsequent inhibition of the unneeded, inappropriate meaning” (Assche et al., 2020).

Creation of Semantic Links

One of the striking findings about establishing semantic priming effects in the present study came from one participant whose performance in Experiment 1 and Experiment 2 aroused the researchers’ interest. In order to elaborate on this case, some details on his performance seem pertinent. The case in question is a 34-year-old male whose performance in Experiment 1 suggested a low WMC (his WMC was measured at 0.41, to be precise; as mentioned before, the highest possible WMC in the reading span task used in the current study is 1.00). Based upon the data of the self-report LEAP-Q, his current exposure to English was amongst the lowest registered by the participants of the study; his exposure to English was reported as 10 per cent. Furthermore, he claimed that his language-learning journey started “late” and had been “sporadic”. One may argue that for a subordinate bilingual whose dominant language is their L1, such a report does not seem unusual. Yet, compared to the rest of the participants, he performed quite poorly in the CMPT, to the extent that even participants with lower WMC outperformed him in reaction times and response accuracy for the experimental stimuli. In other words, his slower reaction times seem to spring from an inefficient creation of semantic links according to the spreading-activation model of semantic priming. This finding corroborates the study by Altarriba and Canary (2004) in that language experience and exposure assist in shaping semantic networks (see also McDonough and Trofimovich (2008, p. 70)).

Conclusion

The current study conducted two experiments investigating the effects of VWMC on the on-line language processing of Persian–English subordinate bilinguals in the context of novel ambiguous linguistic components. In light of the findings, certain points should be made. Firstly, with reference to the access and use of new semantic information in bilinguals, the findings indicate that the allocation of attention between processing tasks will determine the propensity some L2 learners display for the effectual learning of certain L2 components as well as the inability of some to become autonomous in their L2 processing. Secondly, the type of bilingualism seems to dictate the timing of semantic activation in the brain, with those whose L2 is used for mediation taking longer to semantically activate related linguistic elements. In sum, coordinate bilinguals and compound bilinguals can potentially outperform their subordinate counterparts in such cognitive tasks. This point is consequential, especially considering that previous studies in this context had not brought forward such empirical evidence, which in turn underscores the novelty of the findings.

As for the pedagogical implications and applications of the study, investigating bilingual language processing can be considered a first step towards solving the metaphorical jigsaw puzzle of learners’ L2 performance. Furthermore, understanding the underlying mechanism of language processing may help teachers and language practitioners to devise better and more efficient methods to teach their students. These points are particularly relevant to adults and late L2 learners whose second language learning journey has commenced later than usual and whose outcomes in language learning in general have proved less than satisfactory. Concerning L2 learners in general, teaching and learning strategies that maximize their chances of effective L2 acquisition should come to the fore. Learners with weaker memory skills or lower WMC may require more practice time and different pedagogical instruction than those with higher WMC. To give an example, studies of L2 immersion techniques utilized by many language teachers have shown that students with stronger working memory skills made bigger L2 gains than their lower working memory counterparts, highlighting that the interplay between the cognitive characteristics of a language learner and instructional methods can dictate the degree of success in L2 skill acquisition (Traxler, 2011, pp. 437–438). Bearing that in mind can spare both learners and teachers a great deal of frustration.

Some limitations of the study are worth noting. Notwithstanding the fact that attempts were made to include as many participants as practically possible, the ultimate sample size fell short of the ideal; this limitation should be considered in light of the logistics involved (including the difficulties caused by the COVID-19 pandemic at the time of conducting the research). Another limitation of the study is the number of experimental stimuli and conditions seen by each participant, which could have been larger. Be that as it may, in practice, instructing the participants through the practice and main trials of the study proved trickier than one might initially think. This point strikes home particularly when subjects are of different ages and cognitive abilities; the fact that all the trials of the study (i.e., the practice and main trials in addition to the learning phase) had to be responded to via a computer program with which none of the participants were familiar meant that the researchers faced a tall order. It is of great importance to note that the aforementioned limiting factors still need to be explored within the context of the current study in order to corroborate or rebut the findings reported here.

Based on what has been investigated in the current study, certain areas of inquiry in respect of the topics discussed here remain open. In addition to making certain alterations to the methodology implemented by the researchers, new avenues of on-line bilingual language processing can be explored in future research. One such avenue is the encoding and retrieval of semantic information over a longer period of time than the timeline adopted by the researchers for the present study. It has been argued in the literature that semantic priming effects are transitory, and evaluating the semantic priming effects when there is a delay between the encoding of linguistic information and access to that information could possibly result in new or unexpected outcomes. Another area of research could be a change in the type of priming used during the CMPT. Even though the researchers utilized semantic priming in the CMPT in Experiment 2, repetition priming can arguably provide an intriguing alternative. The rationale behind such a claim is that repetition priming has already been used in CMPTs, and for an evaluation of linguistic components that are considered new, repetition priming can serve as an interesting option.