Introduction

One of the challenges frequently faced by second or foreign language learners is second language comprehension, especially when it comes to agreement morphology such as subject-verb agreement and English plurals (Chen 2009; Hahne et al. 2006; Jiang 2007; Johnson et al. 1996; Lardiere 1998; Long 1997; O’Grady 2006; Prevost and White 2000). One prominent source of this difficulty for L2 learners is anaphora. In other words, a very important component of the comprehension process in L2 is the ability of listeners or readers to integrate current information with information mentioned earlier in the text, a process which has been referred to as anaphoric resolution (Cunnings et al. 2017; Pretorius 2005).

A review of the previous literature in the field (Elman et al. 2005; Frazier 1995; Roland and Hare 2012; Traxler 2012) reveals that various models have been proposed over time to shed light on reading, which involves complex cognitive processes (Farmer et al. 2012). Two of the most important models in this field are the two-stage serial model of Frazier (1995) and the constraint-based model of Altmann (1998). The former claims that comprehension occurs in two stages, lexical processing and thematic interpretation (Traxler 2012), in a serial one-way fashion. The latter challenges the assumptions underlying the two-stage serial model, emphasizes the probabilistic and context-sensitive aspects of sentence processing, and assumes that individuals use idiosyncratic lexical, semantic, and pragmatic information about the incoming words to determine an initial structural analysis of them (Altmann 1998).

It may be a truism to state that current models of sentence comprehension generally assume that during this complex process, readers use their linguistic knowledge, and their contextual and background knowledge to create a coherent mental representation of what the text is about. Central to this establishment of mental representation, as Pretorius (2005) noted, is the integration of incoming information with information given earlier in the text, and also the readers’ background knowledge.

This process, referred to as anaphoric resolution, enables readers to integrate new incoming information with already presented information, and success in doing so enables them to continuously update the mental text representation. Given the importance of anaphoric resolution, Webber (1980) noted that if readers fail to cope with anaphors and anaphoric expressions, they will not be able to correctly update their discourse model in response to the difficulties posed by anaphoric expressions. In addition, the process of anaphoric resolution is critical for tracking referents in sentences and maintaining topic continuity within a given text, which is regarded as an important component of text comprehension. Moreover, anaphoric resolution helps readers to understand written information more accurately. That being the case, it can be claimed that by successfully resolving anaphoric expressions, readers develop a more accurate understanding of the text as a whole. Hence, the analysis of anaphora and of the factors that contribute to or inhibit anaphoric resolution is of high significance in psycholinguistic research.

What is Anaphora?

Anaphoric devices are all lexical and syntactic devices that refer back to the information presented previously (Epoge 2012; Cunnings et al. 2017; Pretorius 2005). Put differently, in such cases, a referent that relates to an entity introduced within a text, is again referred to in later times within the same text, using the exact repetition of the entity or other linguistic forms such as pronouns. Hence, the anaphor refers back to its antecedent, and the reader’s job is to find this relationship faster and more accurately. Such antecedents may consist of a single word, a phrase, a complete sentence, an idea or event mentioned previously.

Anaphoric expressions in English may take different forms, such as pronouns, demonstratives, ellipsis, substitution, and lexical anaphors such as proper nouns, definite nouns, and noun phrases. Pretorius (2005) stated that anaphora can fall into two general categories, i.e. structural anaphora and textual anaphora. Structural anaphora is pertinent to those anaphoric expressions that refer back to information presented within a single clause or sentence; while textual or inter-sentential anaphora is related to those expressions that occur across sentences.

While resolving anaphoric expressions, readers need to engage in a sort of text-connecting task, trying to link the anaphoric expression encountered in a sentence with the appropriate antecedent among several possible antecedents presented earlier. During this process, which is sometimes an arduous one for readers, their mental representations are constantly updated as new information arrives. The difficulty that may arise for some readers is that the anaphoric expression may sometimes refer to information that is only implicitly presented and must be recovered from the context. As shown in the following examples, the anaphoric resolution process needed to comprehend example (1) may be quite a simple task for most readers, since the antecedent occurs explicitly; whereas in example (2), the resolution process may be difficult for some readers due to the implicit occurrence of the antecedent.

  (1) John bought himself a bicycle.

  (2) Hercules rode at a furious pace toward Thebes. The horse was exhausted on arrival.

In (2), the noun phrase ‘the horse’ does not have an explicit antecedent in the text, and it can only be understood from the context that Hercules had a horse and was riding it. Such situations, as stated above, pose considerable challenges for L2 readers; however, successful anaphoric resolution is a major source of cohesion in texts. Put differently, the cohesiveness of sentences is largely dependent on the extent to which they contain linguistic items whose interpretation depends heavily on the interpretation of previously mentioned items in the text.

Such being the case, anaphoric resolution as an inferencing process has been the focus of many studies, and many factors contributing to and inhibiting this process have been identified so far. Most studies of anaphora have tested anaphoric resolution by means of reading time, response time or reaction time, as well as eye-tracking measures. The findings of previous studies have shown several variables to influence anaphoric resolution. Among these variables, antecedent identifiability, topic continuity and the distance between the anaphor and the antecedent have been the most conspicuous in the literature (Pretorius 2005).

With regard to the identifiability of the antecedent, this variable concerns the ease with which an anaphor can be successfully referred back to its antecedent. Factors that might contribute to this variable include the overlap between the anaphor and the antecedent in terms of their features, and also their association strength. This variable has been tested in previous studies (Gallini and Spires 1992; McKoon and Ratcliff 1980; Gernsbacher 1989) and found to affect anaphoric resolution.

Another variable referred to above is topic continuity, which has been explored by a few studies (Gordon et al. 1993; Kennison and Gordon 1997). Regarding this variable, it has been indicated that skilled readers are usually more disrupted by repeated anaphors than by reduced anaphors. Gordon et al. (1993) showed that reading times for repeated anaphors, such as name–name anaphors, were longer than for reduced anaphors, such as name–pronoun or pronoun–pronoun.

The other conspicuous variable, focused on repeatedly in previous studies, is the anaphoric distance. It has been found that the most recent antecedent is taken to be the best match with the anaphor (O’Brien et al. 1997). It was found that in cases where the anaphor occurs immediately after the antecedent or is close to it, the information relevant to the antecedent remains in memory and it is more easily recovered while reading.

Other studies within this field have focused on other aspects of anaphoric resolution. For instance, Oakhill and Yuill (1986) and Ehrlich et al. (1999) focused on reader variables in the ability to resolve anaphoric expressions, and found that less skilled readers are less successful at anaphoric resolution than skilled readers.

More importantly, it should not be neglected that many of the studies conducted within this field explored first-language readers rather than L2 readers, and research into anaphoric resolution among L2 readers may be scarce. However, in recent years, several studies, including the present one, have addressed this gap in the literature and investigated second language learners’ ability to resolve anaphoric expressions successfully, along with the factors influencing it. Reviewing the literature, we can see that only relatively recently have researchers begun to investigate L2 learners’ ability in grammatical processing (Clahsen and Felser 2006; Dussias 2001; Felser 2005; Sabourin 2003). The significant point with respect to these studies is that some of them have maintained that second language learners do not comprehend language in the same way as native speakers do, and have suggested that L2 learners’ processing errors may be rooted in different factors.

One possibility, as suggested by Marinis (2003), is that second language comprehenders use inappropriate processing strategies that are transferred from their first language. Another possible reason for such processing errors is that second language sentence processing is fundamentally different from first language processing, in the sense that the syntactic structure computed in L2 is shallower than the one computed in L1 and the comprehension of sentences by L2 learners is more dependent on lexical and pragmatic information, an idea proposed by Clahsen and Felser (2006) and known as the shallow structure hypothesis. Finally, L2 learners’ processing errors have been attributed to the idea that L2 learners may lack the automaticity that native speakers normally have, which may affect the speed of their lexical and morphosyntactic information retrieval (Ardal et al. 1990; Hahne and Friederici 2001).

One of the recent studies addressing L2 readers’ ability to handle anaphoric expressions is Pretorius (2005), in which the anaphoric resolution by first-year ESL students during reading expository texts was under investigation. In this study, the researcher explored the relationship between skill in anaphoric resolution, academic performance, and language proficiency. The findings suggested that learners with weaker academic performance were not skilled at resolving anaphoric expressions, and anaphoric resolution was also affected by linguistic and textual differences as well as language proficiency.

The Present Study

Building on previous studies, the present study focuses on three variables that may influence L2 readers’ processing of anaphoric expressions, namely Working Memory Capacity (WMC), English proficiency, and learners’ age. In other words, this study seeks to find out the extent to which these variables can predict the variation in second language learners’ ability to process and comprehend anaphoric sentences.

Working Memory and Language Learning

Research on WMC and its role in higher order complex activities, such as language learning is not a new field. Working Memory (WM) is defined as the ability of the learner to maintain information in an active and accessible state while simultaneously dealing with and processing new incoming information (Conway et al. 2007). From the above definition, it can be understood that WM has the two components of the dynamic processing of information and the storage of information in a readily accessible state (Baddeley 2007). Composite WM, as well as its separate components have been found to be strong predictors of different aspects of language learning. Before turning to the studies on working memory, it is necessary to explain the two main models of working memory briefly, i.e. the Multicomponent model (Baddeley 2000; Baddeley and Hitch 1974), and the single system model questioning the fractionation of WM into separate components. It should be noted that the conceptualizations of WM are not limited only to Baddeley’s (2000) Multicomponent model and Cowan’s (2008) Embedded-processes model, but there are other views of WM, such as the one which considers WM as an aspect of foreign language aptitude (Robinson 2005b; Skehan 2002), and the executive attention theory (Engle 2002; Kane et al. 2001; Kane et al. 2007), of which the two main models are explained here.

In his multicomponent model, Baddeley (2000) fractionated WM into four components: (1) central executive, an attentional control system with limited capacity, and (2) three limited-capacity slave systems, i.e. the visuospatial sketchpad, the phonological loop, and the episodic buffer. The central executive, as Baddeley (2010) states, is a processing system with the responsibility of allocating attention and the retrieval of long-term memory (LTM). The visuospatial sketchpad is conceptualized as a modality-specific area whose function is to temporarily store visual and spatial information; while the phonological loop is responsible for storing phonological/acoustic information. In the multicomponent model, the phonological loop and the visuospatial sketchpad are held as independent of one another, i.e. they draw from different resources, and they are also independent of the central executive processing. Finally, the episodic buffer has the function of temporarily linking WM storage systems with LTM since in this model the two systems are considered separate but connected.

Cowan’s (2008) embedded-processes model emerged as an alternative to the multicomponent model, questioning the idea of WM having separate modules. In this model, WM is conceptualized as a set of embedded processes comprising the activated part of LTM, which enjoys an unlimited capacity, and the focus of attention. In other words, the information in LTM can become accessible by being taken into WM, a process which happens as an embedded subset of information in LTM and is time limited. Within the activated portion of LTM, there is another embedded subset of activated information called the focus of attention, which is very limited in capacity and can handle only a small amount of information at any time (Oberauer and Kliegl 2006).

A large bulk of studies has thus far indicated the significant role of WM in human development, in general, and various aspects of language acquisition, in particular. In other words, WM was shown to be a strong predictor of successful cognitive tasks. WM has been discovered to be influential in general intellectual abilities, as in reasoning, as well as in general intelligence (Conway et al. 2003), in first language acquisition (Daneman 1991; Daneman and Green 1986), grammar learning (Robinson 2005a; Sagarra 2007), L2 novel vocabulary learning (Daneman and Green 1986; Martin and Ellis 2012), and particularly relevant to the present study in language comprehension (Just and Carpenter 1992; MacDonald et al. 1992; Swanson and Berninger 1995), and second language processing (Ellis 2005; Erlam 2005; Miyake and Friedman 1998; Palladino and Cornoldi 2004; Payne and Whitney 2002; Robinson 2002).

WM has been ascribed a significant role within the framework of language proficiency as well. Van den Noort et al. (2006) examined the interaction between WMC and language proficiency by testing multilingual students. Stating that previous second language studies on simple and complex WM tasks produced mixed results, and that the languages used in such studies were from different linguistic groups, Van den Noort et al. (2006) set out to discover whether the interaction between WM and proficiency is language-specific. Using both simple and complex WM tasks in a controlled experiment, they found performance differences in the first, second, and third languages, supporting the idea that the interaction between WM and proficiency is language-specific, particularly in less proficient learners. However, it has been argued that in more advanced speakers of foreign languages, WM can be more language-independent (Kane et al. 2004; Mackey et al. 2010; Service et al. 2002; Trofimovich et al. 2007).

The majority of the studies mentioned above have indicated the significant role of WM in carrying out complex activities and language-related issues such as comprehension, production, grammar, vocabulary learning, foreign language proficiency, syntactic processing and the four language skills; however, there are studies that provide counterarguments regarding the role of WM in language learning, such as in syntactic processing (Caplan et al. 2007; Juffs 2005), and L2 vocabulary learning (Verhagen and Leseman 2016), lending support to the inconclusive nature of WM among researchers (Goo 2012).

As for the role of WMC in second language processing, it should be stated that the majority of the studies on this topic have addressed the role of WM in the processing of L2 morpho-syntax (Juffs and Harrington 2011), using instruments such as the self-paced reading technique (Juffs and Harrington 1995, 1996), and the reading span task (Daneman and Carpenter 1980). Tasks that have been used to measure WM in language learning studies, in general, and language processing-related studies in particular can be either simple or complex. Working memory, as stated previously, is characterized by both storage and processing components, which can be measured using simple tasks, i.e. only the storage component is measured, as well as complex tasks, i.e. the two components are measured in combination (Juffs and Harrington 2011).

Word span or digit span tests are good examples of simple tasks, in which sets of unrelated words or numbers are presented to individuals and should be recalled (Shah and Miyake 1996). Performance on word span tests, however, is dependent on individuals’ prior word knowledge, which may leave confounding effects, particularly in second language studies (Gathercole 1995). Digit span tests are less language dependent than word span tests as words are not used in them; however, they are still affected by previous language knowledge. Baddeley et al. (1998) solved this problem by proposing another test named non-word repetition task (NWR), in which individuals recall nonsense words.

Complex WM tasks (Colom et al. 2006), on the other hand, measure both storage and processing components. Such tasks have been used in different language related studies so far, measuring an overall working memory capacity. The most common complex tasks include the counting span test (Case et al. 1982), which requires individuals to keep track of colored dots; the reading span test (Daneman and Carpenter 1980), which measures overall WM by tapping into both WM storage and processing; the listening span test (Mackey et al. 2002, 2010), which is the spoken version of the reading span test; and finally the language-independent operation span test (Turner and Engle 1989), in which individuals are required to solve simple mathematical equations (tapping into processing component) and recall a string of letters (tapping into storage component of WM). This test has been widely used in several language related studies of WM and is accepted as measuring verbal WMC (Conway et al. 2003; Goo 2012; Tokowicz et al. 2004; Unsworth and Engle 2007).

In a series of studies, Juffs (2004, 2005, 2006) examined the extent to which WM could explain reading speed differences (measured by self-paced reading tests) in three cognitively demanding structures, namely garden path sentences, long-distance wh-movement, and sentences with temporary ambiguity between the main verb and the reduced relative clause. The results of these studies indicated no evidence for the role of WM in L2 processing.

More relevant to the present study, Rodriguez (2008) explored the role of WM in the reading time of three cognitively demanding structures, i.e. anaphoric reference, resolution of long-distance relative clauses, and resolution of verb-based ambiguity. Similarly, he did not find any significant role of WM in the reading time of these structures.

Given the somewhat surprising findings of Juffs (2004, 2005, 2006) and Rodriguez (2008), in which little effect of WM was reported, it can be asserted that the studies in the field of WM and second language processing have produced mixed results, and the role of WM in language processing is still inconclusive. In addition, little (if any) research has been conducted thus far examining the potential role of WMC, as an important individual difference, in the processing and comprehension of anaphoric sentences. More importantly, most of the previous studies on language comprehension and processing have focused on composite WMC and neglected the potential role of its separate components, i.e. processing and storage, in language comprehension. That being the case, the present study is designed to address this gap in the literature by examining the relationship between composite WMC, as well as its two components, and L2 learners’ processing time and comprehension of anaphoric sentences. Two other variables, i.e. learners’ language proficiency and age, are also included besides WMC to provide a clearer picture of the factors affecting learners’ ability in anaphoric resolution.

This experiment attempts to answer the following research questions:

  1. To what extent can composite WMC, language proficiency, and learners’ age predict EFL learners’ processing and comprehension of anaphoric sentences?

  2. To what extent can the processing component of WMC and the storage component of WMC predict learners’ processing and comprehension of anaphoric sentences?

Method

Participants

The participants of the present study were 40 male EFL learners (N = 40) studying English at different levels of proficiency at a language institute in Tehran, Iran. Only male students were included because of their availability in the context in which the study was conducted; in other words, the language institute was not co-educational. The participants’ ages ranged from 13 to 30 years (M = 15.9). The levels at which they were studying at the institute varied from elementary to high-intermediate on the basis of the institute’s general proficiency levels. However, to get a more accurate assessment of their proficiency levels, they were given a standardized placement test. Prior to the administration of the tests, all the participants were informed that these tests were not a part of their routine classroom procedure, and that they were participating in a research study; those who volunteered to take part in all the procedures were selected as the participants of the study. Since age and language proficiency were both variables of the study, the learners were not homogenized in terms of these two variables.

Instruments

Anaphoric Test

The anaphoric test was designed by the researcher using the IBEX Farm server, and contained 19 self-paced reading items. Of the 19 items, 14 contained anaphoric expressions, all of which referred to an antecedent occurring prior to the anaphoric word, and 5 were distractors containing no anaphoric expressions. It should be noted that the order of the anaphoric items and the distractors was randomized throughout the test. The participants were asked individually or in groups to take the test in the presence of the researcher. The time needed to answer the test items varied from 5 to 20 min, depending on the participants’ level of proficiency.

Each participant was asked to type his full name and age in the introduction section before taking the test. The test items, as referred to above, were in the form of a self-paced reading task, in which the participants read each word in a sentence by pressing the space bar on the keyboard. After the sentence had been read, a comprehension question automatically appeared on the screen, to which the participants had to reply. These questions served to assure the researcher that the participants really focused on the meaning of the test items and comprehended them, rather than pressing the space bar quickly just to finish the test faster. In addition, they served to measure learners’ comprehension of the anaphoric sentences, since answering them required successful anaphoric resolution.

The results of the test administration were sent to the server automatically, and saved in a file for further analyses. Among the vital results the server provided was the measurement of the reading time for each individual word in a sentence and the whole sentence in milliseconds, which can provide the researcher with useful information on the processing of each word. Below is a sample of the items used in the self-paced reading anaphoric test:

Fred asked Joe to pass him the potatoes.

Which sentence is true?

  1. Joe wanted some potatoes.

  2. Fred wanted Joe to eat some potatoes.

  3. Fred wanted to have some potatoes.

Scoring of the Anaphoric Test

The processing score for each individual was calculated on the basis of the average time spent by that participant reading all the anaphoric words. After the administration of the test, the results obtained for each individual were analyzed separately, and the average time he spent reading all the anaphoric words was calculated. This average time in milliseconds was taken as the anaphoric expression processing score for each participant and was entered into SPSS for data analysis. Moreover, the number of correct responses to the comprehension questions was added up to obtain the anaphoric expression comprehension score. This score ranged from zero (i.e. no correct answer was given to the questions) to 14 (i.e. all comprehension questions containing anaphora were answered correctly). Therefore, the anaphoric test provided two sets of scores, one for the processing of anaphoric expressions and the other for their comprehension.
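The two scores described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the procedure actually used (the scores were computed from the IBEX output and analyzed in SPSS); the function name, the input layout, and the example values are all hypothetical.

```python
from statistics import mean


def score_participant(anaphor_rts_ms, answers_correct):
    """Return (processing_score_ms, comprehension_score) for one participant.

    anaphor_rts_ms: reading time in ms for the anaphoric word of each
        anaphoric item (hypothetical input layout).
    answers_correct: one boolean per comprehension question on an
        anaphoric item.
    """
    processing = mean(anaphor_rts_ms)      # average RT over anaphoric words
    comprehension = sum(answers_correct)   # number of correct answers (0 to 14)
    return processing, comprehension


# Made-up values for the 14 anaphoric items (not data from the study).
rts = [400, 500, 600, 500, 450, 550, 500, 500, 480, 520, 500, 500, 490, 510]
correct = [True] * 10 + [False] * 4
processing, comprehension = score_participant(rts, correct)
```

With these invented values the processing score is the mean of the 14 reading times and the comprehension score is the count of correct answers; distractor items are simply left out of both inputs.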

Operation Span (OSPAN) Task

An OSPAN task was used to measure the participants’ working memory capacity (WMC) prior to the administration of the anaphoric test. The OSPAN task is a complex span measure that has been widely used to tap verbal WMC (Kane et al. 2004; Tokowicz et al. 2004; Unsworth 2007; Unsworth and Engle 2007). The reason for using an OSPAN rather than a Reading Span task was to minimize the dependence of the test on prior language knowledge (Turner and Engle 1989). This task engages WM by making participants maintain linguistic items (e.g. alphabet letters) in memory for later recall while solving math problems. In the present study, an OSPAN task modified from Unsworth et al. (2005) was developed and administered using Microsoft PowerPoint Version 2010. A total of 28 mathematical operations (such as (2 × 3) + 1 = ?) were generated. In addition, Persian consonants were used as recall items (i.e. to-be-remembered items) and were equally distributed across the sets of operations (never twice in the same set). The math operations were distributed in sets of 2, 3, 4, or 5, with two sets for each set size; i.e. there were two blocks, each containing one set of 2, 3, 4, and 5 operations, giving 2 × (2 + 3 + 4 + 5) = 28 operations in total. The order of the set sizes was randomized so that the participants could not guess the number of items they had to remember.
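The set structure described above (two sets at each of the sizes 2 to 5, presented in random order, with no recall item repeated within a set) can be sketched as follows. This is a hedged illustration only: the task itself was built by hand in PowerPoint, the function name is hypothetical, and placeholder Latin letters stand in for the Persian consonants. The sketch also does not enforce the equal distribution of consonants across sets mentioned in the text.

```python
import random

SET_SIZES = [2, 3, 4, 5]
SETS_PER_SIZE = 2


def build_ospan_sets(letters, rng):
    """Return the sets in presentation order; each set is a list of recall items."""
    sizes = SET_SIZES * SETS_PER_SIZE
    rng.shuffle(sizes)  # randomize order so the next set size is unpredictable
    # rng.sample draws without replacement, so no letter repeats within a set
    return [rng.sample(letters, size) for size in sizes]


# Placeholder letters (assumption): the study used Persian consonants.
letters = list("BDFGHJKLMNPR")
sets = build_ospan_sets(letters, random.Random(0))
total_operations = sum(len(s) for s in sets)  # one operation per recall item
```

Because each recall item is paired with one mathematical operation, the eight sets together account for the 28 operations reported in the text.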

During the actual task, the participants silently read each mathematical operation on a slide that was shown for 3000 ms, which was followed by another slide that displayed a number as a possible answer to the operation and also the options Yes or No. The participants were given 3000 ms to decide if the number provided on the slide was a correct answer to the operation and to put a check mark on the answer sheet under Yes or No columns.

Following this slide, a Persian consonant appeared, which was displayed for only 1000 ms and was automatically followed by the next mathematical operation. When a set was finished (for instance, a set of size 2 contained two operations and consequently two to-be-remembered Persian consonants), a recall cue (i.e., ???) appeared on a new slide, which prompted the participants to recall the consonants shown in the preceding set and write them down in the order of presentation on the other side of the answer sheet. It should be noted that for the recall part of the task, the participants were given 4000 ms for set size 2, 6000 ms for set size 3, 8000 ms for set size 4, and 10000 ms for set size 5. A set of size 2 is presented in Table 1 as an example, in the order in which it appeared in the PowerPoint slides.

Table 1 Example of a set of two OSPAN task

Scoring of the WMC Test

In order to score the WMC test, 1 point was given for each correct answer to a mathematical operation, and no point for an incorrect answer. As for the recall part of the WM span measure, 1 point was awarded for each correctly recalled item in its correct serial position within a set, and no point for an incorrectly recalled item. Therefore, the total possible score for a set of 2 operations would be 4 points, i.e. 2 points for the two mathematical operation questions, which form the processing component of WMC, and 2 points for the two consonants that were to be recalled, which form the storage component. In the same way, a set of 3 would give a score of 6 points, a set of 4, 8 points, and finally, a set of 5 operations would give 10 points. Therefore, the score of the OSPAN task for the composite WMC would range from zero to 56 points. The closer an individual’s score is to 56, the higher his/her WMC span. In addition, the scores of the processing and storage components of WMC for each participant would each range from zero to 28.
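The scoring rule above can be expressed as a short Python sketch. It is an illustration under stated assumptions rather than the scoring sheet actually used: the function names and the tuple layout for a set are hypothetical, and the example responses are invented.

```python
def score_set(math_correct, presented, recalled):
    """Score one OSPAN set; return (processing_points, storage_points)."""
    processing = sum(math_correct)  # 1 point per correctly judged operation
    # 1 point only when an item is recalled in its correct serial position;
    # items missing from a short recall list earn nothing
    storage = sum(1 for p, r in zip(presented, recalled) if p == r)
    return processing, storage


def score_task(sets):
    """Sum the component scores over all sets and add the composite score."""
    processing = storage = 0
    for math_correct, presented, recalled in sets:
        p, s = score_set(math_correct, presented, recalled)
        processing += p
        storage += s
    return {"processing": processing, "storage": storage,
            "composite": processing + storage}


# Invented responses for a set of 2 and a set of 3 (placeholder letters).
example = [
    ([True, False], ["B", "D"], ["B", "D"]),
    ([True, True, True], ["F", "G", "H"], ["G", "F", "H"]),
]
scores = score_task(example)
```

In the second invented set, the first two letters are recalled in swapped positions, so only the third earns a storage point. Over the full task of 28 operations and 28 recall items, each component has a maximum of 28 and the composite a maximum of 56.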

Quick Placement Test (QPT)

In order to assess the participants’ level of English proficiency, the standardized Quick Placement Test was administered prior to the anaphoric test. The test was designed by the University of Cambridge Local Examinations Syndicate (2001) and comprises two parts. Part one consisted of 40 items measuring grammar, reading and vocabulary knowledge of the participants, and the time allowed for answering the items was 30 min. Part two consisted of 20 items, and the time allotted for answering the items was 30 min. Overall, the time required to take the test was 60 min according to the instructions of the test designers. Based on the scoring instructions of the test, participants whose scores lie between 0 and 17 are at the level of A1, those with scores between 18 and 39 are at the level of A2, between 40 and 47 at the level of B1, between 48 and 54 at the level of C1, and the participants with scores between 55 and 60 lie at the C2 level of proficiency. Based on this rating scale, A1 learners are beginner learners whose proficiency is very low, while C1 and C2 learners are at the advanced level of English proficiency.

Procedure and Data Analysis

Forty EFL learners at different proficiency levels participated in the study. They all volunteered to be a part of this study. Subsequently, they were informed about the procedures of the study and what they were expected to do during test administrations; however, great care was taken so as not to give away the purpose of the study.

In the next step, a WMC test (Operation Span Test) was administered. The test took about 5 min to complete and was designed using Microsoft PowerPoint 2010. The participants took the test either individually or in groups, depending on the facilities available in the classrooms. The scores, which as described above ranged from zero to 56, were recorded and entered into SPSS for statistical analysis. The participants were also given a placement test (Oxford Placement Test) to measure their proficiency level prior to the anaphoric test.

The anaphoric test was administered after the WMC and placement tests were given. The participants were asked to answer 19 self-paced reading items. The test, as referred to previously, was designed by the researcher using the IBEX Farm server. All the calculations of the reading time for each sentence and for each individual participant were automatically carried out by the IBEX server.

Several Multiple Regression Analyses, using SPSS V.21, were run to determine the extent to which the independent variables (i.e. composite WMC, the processing and storage components of WMC, language proficiency, and age) predict the variation in learners’ processing and comprehension of anaphoric sentences.
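For readers who wish to see what such an analysis computes, the following is a minimal NumPy sketch of a multiple regression reporting standardized coefficients (betas) and R². It is not the SPSS procedure used in the study. The data are randomly generated placeholders, and the function name, score ranges, age range, and effect size are all assumptions made purely for illustration.

```python
import numpy as np


def standardized_regression(X, y):
    """Ordinary least squares on z-scored variables.

    Returns the standardized betas (one per column of X) and R squared.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # z-score predictors and outcome so the coefficients are standardized betas
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    # no intercept is needed because every variable is centered
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    resid = yz - Xz @ betas
    r_squared = 1.0 - (resid @ resid) / (yz @ yz)
    return betas, r_squared


# Synthetic placeholder data (NOT the study's data); all ranges are assumptions.
rng = np.random.default_rng(0)
n = 40
wmc = rng.integers(0, 57, size=n)          # composite WMC, 0 to 56
proficiency = rng.integers(0, 61, size=n)  # placement test score, 0 to 60
age = rng.integers(15, 40, size=n)         # hypothetical age range
comprehension = 0.5 * wmc + rng.normal(0, 5, size=n)  # made-up outcome

X = np.column_stack([wmc, proficiency, age])
betas, r2 = standardized_regression(X, comprehension)
for name, b in zip(["WMC", "proficiency", "age"], betas):
    print(f"{name}: beta = {b:.2f}")
print(f"R squared = {r2:.2f}")
```

Standardizing the variables first makes the coefficients comparable across predictors measured on different scales, which is why betas rather than raw coefficients are used to compare the relative contribution of each predictor.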

Results and Discussion

Prior to running the Multiple Regression Analyses, all the assumptions of this statistical test, such as normality of distribution, linearity, homoscedasticity, and absence of multicollinearity, were checked. Since the composite WMC was highly correlated with both its processing component (r = 0.82, p < 0.05) and its storage component (r = 0.84, p < 0.05), the multicollinearity assumption was not met; therefore, the two components were removed from the main regression analysis and examined in two separate regression analyses for anaphora processing and comprehension. The other independent variables were not highly correlated; hence, the assumption of no multicollinearity, along with the others, was met for them. Table 2 indicates the descriptive statistics for the variables of the study.
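A multicollinearity check of this kind can be illustrated with pairwise correlations and variance inflation factors (VIFs). The sketch below is an assumption about how such a check might be coded, not the SPSS diagnostics used in the study; the data are synthetic, and the function name is hypothetical. The VIF of a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on all the others.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return one VIF per column of X (rows are participants)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # regress predictor j on the remaining predictors plus an intercept
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r_squared = 1.0 - resid.var() / target.var()
        vifs.append(1.0 / (1.0 - r_squared))
    return np.array(vifs)


# Synthetic placeholder data (NOT the study's data).
rng = np.random.default_rng(0)
n = 40
processing = rng.integers(0, 29, size=n)  # WM processing, 0 to 28
storage = rng.integers(0, 29, size=n)     # WM storage, 0 to 28
composite = processing + storage          # composite WMC, 0 to 56
age = rng.integers(15, 40, size=n)        # hypothetical age range

# Entering the composite together with one of its own components
X = np.column_stack([composite, processing, age])
print(np.corrcoef(X, rowvar=False).round(2))  # pairwise Pearson correlations
print(variance_inflation_factors(X).round(2))
```

Note that entering the composite together with both of its components would make the predictors exactly collinear, because the composite is their sum; the sketch therefore includes only one component alongside the composite.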

Table 2 Descriptive statistics

The results of the regression analyses (Table 3) revealed that 36.5% of the variance in the first dependent variable, i.e. processing of anaphoric expressions, could be accounted for by the independent variables of the study [F(3, 36) = 6.89, p < 0.05]. As for the second dependent variable, i.e. comprehension of the anaphoric sentences, the regression results indicated that 70% of the variability in participants’ comprehension of the anaphoric sentences could be explained by the three variables of proficiency, age, and WMC [F(3, 36) = 28.25, p < 0.05].
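The reported percentages of explained variance can be cross-checked against the F statistics. Because F = (R² / k) / ((1 − R²) / df_resid) for a regression with k predictors, it follows that R² = kF / (kF + df_resid). The short sketch below applies this identity to the values reported above; it assumes that the percentages are unadjusted R² values, and the function name is hypothetical.

```python
def r_squared_from_f(f_value: float, df_model: int, df_resid: int) -> float:
    """Recover unadjusted R squared from an overall regression F test."""
    explained = f_value * df_model
    return explained / (explained + df_resid)


# Values as reported for the two dependent variables: F(3, 36)
r2_processing = r_squared_from_f(6.89, 3, 36)
r2_comprehension = r_squared_from_f(28.25, 3, 36)
print(f"processing:    R squared = {r2_processing:.3f}")
print(f"comprehension: R squared = {r2_comprehension:.3f}")
```

Under that assumption, the recovered values agree with the reported 36.5% and 70% to within rounding.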

Table 3 Total variance prediction

Among the independent variables, as shown in Table 4, composite WMC (Beta = −0.664, p < 0.05) significantly predicted the variability in learners’ anaphoric processing, whereas language proficiency and learners’ age were not significant predictors. In addition, composite WMC significantly predicted variation in learners’ comprehension of anaphoric sentences (Beta = 0.88, p < 0.05), whereas age and proficiency did not make a significant contribution.

Table 4 Multiple regression analyses results for the processing and comprehension of anaphoric expressions

With regard to the above findings, it can be stated that learners’ WMC is the better predictor of variation in the processing and comprehension of anaphoric expressions, while the other two variables (i.e. age and proficiency) were not significant predictors, as their beta values were relatively lower. This result can be taken as empirical evidence for the significant role of learners’ WMC in their ability to process and comprehend sentences containing anaphoric expressions.

The second research question was intended to examine the extent to which the two components of WM, i.e. WM storage and WM processing, could separately predict learners’ processing and comprehension of anaphoric sentences. To this end, two separate multiple regression analyses were run, with the WM processing and storage as the predictors, and anaphoric expression processing and comprehension as the outcome variables. The results are displayed in Tables 5 and 6.

Table 5 Total variance prediction
Table 6 Multiple regression analyses results for the processing and comprehension of anaphoric expressions

The results of the analyses show that 36% of the variation in anaphoric expression processing can be predicted by variation in WM processing and WM storage [F(2, 37) = 10.76, p < 0.05], with WM processing making the greater contribution (Beta = −0.4, p < 0.05). The contribution of WM storage is also significant (Beta = −0.3, p < 0.05); however, compared with WM processing, it plays a smaller role in the processing of anaphoric expressions.

As for the comprehension of sentences with anaphora, it was found that 72% of the variation in this variable can be accounted for by WM processing and WM storage [F(2, 37) = 47.6, p < 0.05], with the WM processing component making a much greater contribution (Beta = 0.6, p < 0.05). The WM storage component is also a significant predictor of anaphoric sentence comprehension (Beta = 0.3, p < 0.05); however, its effect is smaller than that of WM processing.

In sum, for the first research question, on the extent to which composite WM, language proficiency, and learners’ age can predict EFL learners’ processing and comprehension of anaphoric sentences, the results of the multiple regression analyses indicated that composite WM, measured through a complex OSPAN WM task, played a significant role in predicting the variation in both anaphoric processing and comprehension, while language proficiency and learners’ age did not make any significant contribution. With regard to the second research question, on the extent to which the two components of WM, i.e. WM processing and WM storage, can predict variation in the two dependent variables, the results of the multiple regression analyses revealed that WM processing played a significant role and accounted for much of the variation in anaphoric processing and comprehension, while WM storage, although it made a significant contribution, played a smaller role.

These findings can be explained in terms of the mechanisms through which learners carry out anaphoric resolution. It is generally assumed that during the process of reading, readers use their linguistic knowledge, text knowledge, and background knowledge in order to create a mental representation of what the text is about (Grabe 1991). According to Pretorius (2005), a critical factor in the creation of this mental representation is the integration of new information with that already given in the text, a process which forms the very foundation of anaphoric resolution. The question that comes to mind is how learners are able to integrate incoming information with previously given information. In other words, what factors play a role in this ability? As Pretorius (2005) and Webber (1980) have noted, learners can do this by identifying and tracking referential continuity, focusing on the text and the shifts in its focus.

Another factor that can arguably be added to these is individual differences in WM, which were found to influence the extent to which learners can track referential continuity and antecedents. In other words, the higher an individual’s WMC, the faster he/she can resolve the anaphoric expression, find the antecedent, process the text, and comprehend it. Considering the definition of WM, we can establish a link between this variable and anaphoric resolution. WMC has been defined as “the ability to maintain information in an active and readily accessible state, while concurrently and selectively processing new information” (Conway et al. 2007, p. 3). WM has been considered a vital construct in human cognition, and it has been found to be significantly correlated with higher-order cognitive behaviors such as reading and reasoning, and also with intelligence (Goo 2012).

It can be argued that if working memory is the ability to maintain information in an active state while simultaneously processing new pieces of information, then it should affect anaphoric resolution, since the same process is involved in the latter. In other words, in anaphoric resolution, readers need to maintain previously given information, such as different candidates for the antecedent, concurrently process the new incoming information, which is the anaphoric word in this case, and eventually relate the anaphoric word to its antecedent. Hence, anaphoric resolution may be affected by WMC, which was in fact corroborated in the present study.

The study also revealed that age and proficiency are not good predictors of anaphoric test scores in either processing or comprehension. Hence, it can be concluded that students of any given age can be expected to vary in their anaphoric resolution abilities while reading. The Pearson correlations between age, on the one hand, and anaphora processing scores (r = −0.165, p > 0.05) and comprehension scores (r = 0.296, p > 0.05), on the other, corroborate this claim. A similar conclusion can be drawn regarding students’ level of language proficiency. It should be noted that while proficiency is indeed correlated with anaphora processing (r = −0.541, p < 0.05) and comprehension (r = 0.746, p < 0.05), its predictive power is relatively weak compared with that of WMC.

As for the separate components of WM, both were found to be significant predictors of anaphoric processing as well as comprehension, with WM processing making the greater contribution. It seems that during the process of anaphoric resolution, in which learners need to find the right antecedent, both the temporary storage of information, i.e. the possible antecedents in a given sentence, and the processing of information, i.e. processing of the incoming information to comprehend the message until the right antecedent is found, play major roles; however, the whole process depends mostly on the WM processing component rather than its storage counterpart. These findings are not in line with those of Juffs (2004, 2005, 2006) and Rodriguez (2008), which indicated little role for WM in language processing.

The findings of the study have implications for foreign language teachers and students. Teachers need to be aware of the fact that learners’ WMC, as a cognitive individual difference, can play significant roles in their reading ability and comprehension, as well as resolving anaphoric expressions. Learners also need to be cognizant of the mechanism through which they can find the proper antecedents for the anaphoric word by knowing that they need to enhance their WMC and practice maintaining different possible antecedents in mind while reading and comprehending sentences. As we know, this skill is one of the most important ones needed for reading achievements and success in language tests.

However, there were a few limitations with which the researcher had to deal during the data collection procedure. One limitation of the study was the small number of participants for a language-related psycholinguistic study. Unfortunately, due to problems in accessing more students over a 3-week period, it was not possible for the researcher to include more participants in the study. Moreover, due to the unavailability of the necessary equipment, the participants could not be formally screened for neurological, psychological, and hearing/sensory issues; instead, the potential intervening influence of any relevant medical issues was screened out by simply asking the participants or their parents about their medical history, particularly with regard to any previous hearing problems or whether they were under the influence of any medicine which might jeopardize their attention and concentration.

Another limitation of the study was related to the development of the anaphoric test. The test was designed on the IBEX web server, and there were several obstacles in analyzing and scoring the test results in milliseconds. Unfortunately, the researcher had to analyze the average time spent by each individual for reading and processing the anaphoric words manually, which was a very time-consuming process.

The last limitation was related to the administration of the WMC test, which had to be carried out using Microsoft PowerPoint. Due to limitations in the facilities of the institute where the data were collected, the researcher had to give the test to students individually, which was also a time-consuming process.