Introduction

Willingness to communicate (WTC) has been considered a contextual and individual difference variable in applied linguistics. The concept of WTC was imported from the literature on L1 communication to investigate the factors influencing effective communication in L2. The core question underlying L2 WTC research might be stated simply as: Why are some language learners more willing to use the L2 than others? The answers to that question have become increasingly complex over the years, prompting the present research, which aims to synthesize the empirical findings. Despite insightful narrative reviews of L2 WTC and the influence of its predictor variables, no formal meta-analysis synthesizing the findings of individual studies has so far been conducted. Hence, the current paper is a meta-analysis of predictors of L2 WTC, with English as the second language. The purpose is to test the strength of the correlation between L2 WTC and each of the three correlates theoretically proposed and empirically shown to be the key influences on WTC in previous studies.

Willingness to Communicate

Originally, L1 WTC was considered to be “a personality-based, trait-like predisposition” (McCroskey and Richmond 1991, p. 134). When MacIntyre et al. (1998) introduced a model of WTC for second language acquisition, they developed a second conceptualization of WTC as reflecting “a readiness to enter into discourse at a particular time with a specific person or persons, using a L2” (p. 547). MacIntyre et al. (1998) offered a conceptual pyramid-shaped model including a range of linguistic, contextual, and psychological variables influencing L2 WTC, organized along a continuum from distal to proximal influences. The model integrates six levels or layers of conceptualization. The enduring influences begin with a base in learner personality and intergroup processes, moving through the cognitive, affective and social contexts in which learning occurs, then to motivational propensities and enduring self-confidence with the language (representing a combination of perceived competence and low anxiety).

More situated influences appear at the top of the pyramid including a state of self-confidence in one’s ability to communicate in a specific situation at a specific time, and the desire to communicate with a specific person in a specific location. As a situated state, WTC reflects the integration of a large number of interacting internal and external processes and can be considered the final psychological step preparing a person to use the language they are learning. Larsen-Freeman (2007) astutely observed that “it is not that you learn something and then you use it; neither is it that you use something and then you learn it. Instead, it is in the using that you learn—they are inseparable” (p. 783). MacIntyre et al. (1998) considered WTC as a way of talking in order to learn, reflecting a concern for the integrated processes of learning and communication. From this view, the concept of WTC can be closely associated with the interactionist perspective (Gass 2003; Gass and Mackey 2006, 2007; Gass et al. 2005) putting emphasis on the role of meaning negotiation during the process of L2 interaction. Considering the complex nature of the association between interaction and learning, researchers have focused on the specific features that impact up on the formation of meaning during the process of L2 interaction (e.g. Long 2007; Mackey 2012; Mackey et al. 2012; Mackey and Goo 2012) which might explain the indispensable association between using and learning a second language underpinning the concept of L2 WTC. On the other hand, individual differences are firmly tied to the theoretical foundations of the interaction approaches, providing teachers and students with opportunities to concentrate on meaning and communication that consequently lead to emerging interactive learning processes and facilitate language acquisition (Long 2015). One of the influential individual differences is WTC.

Whereas the higher layers, which form the top of the pyramid, reflect immediate situational influences, the lower layers reflect enduring influences, the relatively stable factors that cut across many communication situations (MacIntyre et al. 1998). On the one hand, the shape of the pyramid model, moving upward from distal influences to more proximal ones, reinforces the conceptualization of WTC as an emergent state of mind that reflects dynamic fluctuations in the situation as well as within the learner himself or herself (MacIntyre and Legatto 2011). On the other hand, an enduring level of WTC can be considered to be the ultimate aim of language learning in courses or programs via the linguistic and communicative tools required for communication. Although it is not shown explicitly in the pyramid model, the literature on L2 WTC has indicated that the trait-level construct is predicted by enduring variables. It must be emphasized that conceptualizations of the trait and state levels of WTC, addressed by different timescales, are considered complementary (Cao and Philp 2006; MacIntyre et al. 1999).

Given the flexibility in the definition, L2 WTC has been investigated by researchers using a variety of qualitative, quantitative, and mixed methods (Baran-Łucarz 2014; Cao 2011; Cetinkaya 2005; Elahi Shirvan and Taherian 2016; Khajavy et al. 2016; MacIntyre and Legatto 2011; Mystkowska-Wiertelak and Pawlak 2017; Pawlak et al. 2016; Peng 2007). However, the bulk of research on L2 WTC thus far has focused on the trait-level, using quantitative methods and questionnaire measures of L2 WTC. From the trait-level perspective, L2 WTC has been linked to numerous variables which are at the social and individual levels, such as personality (Ghonsooly et al. 2012; MacIntyre and Charos 1996), motivation (MacIntyre et al. 2001; Peng and Woodrow 2010), international posture (Ghonsooly et al. 2012; Yashima 2002; Yashima et al. 2004), self-confidence (Baker and MacIntyre 2000; Ghonsooly et al. 2012; MacIntyre et al. 2001), age and gender (MacIntyre et al. 2002a, b). Based on the literature on L2 WTC, we can infer that the three main correlates of this construct which have been examined most frequently by the researchers studying L2 WTC are perceived communicative competence, (lack of) language anxiety and motivation.

Although numerous studies have highlighted the crucial role of L2 WTC in the acquisition of English as an L2, the strength of the associations between WTC and its correlates has been rather inconsistent across studies, and the literature lacks a quantitative meta-analysis of the existing results. Thus, a meta-analysis exploring the strength of the associations between L2 WTC and its correlates is needed. Meta-analysis can help researchers make sense of a challenging body of research with a level of precision that the traditional literature review cannot provide. Furthermore, it can provide researchers interested in L2 WTC with insights into identified patterns as well as trends in and associations among various research findings (Plonsky 2014).

Moreover, one of the main problems in the analysis of quantitative studies conducted in the field of applied linguistics is the “power problem” (Plonsky 2013, p. 678), and this applies to the studies carried out on L2 WTC. Evidence of this power problem includes “extremely rare use of power analyses in order to inform sampling decisions” and “heavy reliance on null hypothesis significance testing (NHST)” (Plonsky 2015, p. 29). Since the desired level of statistical power, “the probability of observing a statistically significant relationship”, in the social sciences is .80 (Plonsky 2015, p. 29), the anticipated effect size must be entered into the equation to determine the sample size required for this level of power. One of the main sources for obtaining this estimated effect size in L2 WTC research is a meta-analysis of L2 WTC research (Plonsky 2015).
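The link between anticipated effect size, power, and sample size can be illustrated with a minimal sketch. It assumes the standard Fisher z approximation for a two-tailed test of a zero correlation; the function name and the example values of r are illustrative choices, not a procedure taken from the studies cited above.

```python
from math import atanh, ceil
from statistics import NormalDist


def required_n_for_correlation(r, power=0.80, alpha=0.05):
    """Approximate N needed to detect a population correlation r.

    Assumption: Fisher z approximation, two-tailed test of rho = 0.
    n = ((z_{1 - alpha/2} + z_{power}) / atanh(r))**2 + 3
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be non-zero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    return ceil(((z_alpha + z_power) / atanh(abs(r))) ** 2 + 3)


if __name__ == "__main__":
    # Illustrative anticipated effect sizes (hypothetical, not study results).
    for r in (0.1, 0.3, 0.5):
        print(r, required_n_for_correlation(r))
```

The smaller the anticipated correlation, the larger the sample required, which is why a meta-analytic estimate of the expected effect is a useful input to sampling decisions.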

The present meta-analysis focuses on three correlates that have been examined most frequently by the researchers studying L2 WTC, namely perceived communicative competence, (lack of) language anxiety and motivation. Following Jeon and Yamashita (2014), we regard these three variables as high-evidence correlates. In addition to these three variables, there exist other correlates that are less frequently examined by researchers. They include attitude toward L2, attitude toward learning, proficiency level, ideal L2 self, ought-to L2 self, and ratings of listening, speaking, reading and writing proficiency, which will be referred to as low-evidence correlates (Jeon and Yamashita 2014). It should be noted that although these low-evidence correlates might be significant in WTC, the number of available effect sizes in the literature for meta-analysis is limited at the present time. For this reason, sufficient statistical information for producing trustworthy interpretations is not yet available and therefore meta-analysis of the low-evidence correlates would be premature.

Review of L2 WTC High-Evidence Correlates

The original empirical studies of L1 and L2 WTC separated perceived communicative competence and communication apprehension (anxiety), and for the present study we will also retain the distinction between them. We recognize that some L2 studies have combined perceptions of competence and low anxiety under the higher-order construct of “self-confidence” (Clément 1986). However, for the purpose of reviewing the literature, as it has been reported, and estimating the contributions of the high evidence correlates to the prediction of WTC, it is best to estimate separately the relationship between WTC and both perceived competence and language anxiety.

Perceived Communication Competence

Among the predictors of L2 WTC, perceived communication competence appears to have a strong connection to developing a willingness to initiate communication (Kim 2004; MacIntyre et al. 1999; MacIntyre and Charos 1996; Onwuegbuzie et al. 2000; Sparks and Ganschow 2001; Yu 2008). Perceived self-competence reflects the learners’ self-assessment of their competence and it has been argued that the perception of competence might have a greater impact on L1 communication, as compared to the effect of actual ability (McCroskey and Richmond 1991). Researchers have argued that the most important predictor of one’s level of WTC is perceived communicative competence (MacIntyre 1994; MacIntyre et al. 1998, 1999, 2002a, b; MacIntyre and Charos 1996; McCroskey and Richmond 1991; Yu 2008). Most of these studies found that the higher the level of perceived communicative competence, the higher the level of WTC. Perceived competence can be evaluated by a scale with 12 items proposed by MacIntyre and Charos (1996).

Providing further evidence to this argument, Yu (2008) conducted a study with 235 second year and third year university students studying English at a Chinese public university. The results demonstrated that self-perceived communicative competence had a positive relationship with WTC in both first and second languages (r = .53 for Chinese, r = .50 for English), suggesting that the learners with higher level of self-perceived communicative competence were more willing to communicate.

Likewise, Baker and MacIntyre (2003) found that self-perceived communicative competence was the only variable significantly correlated with L2 WTC for the non-immersion students. Also, MacIntyre et al. (2002a, b) reported that the only significant predictor of L2 WTC was self-perceived communicative competence for students who had no background immersion experience.

Language Anxiety

It is now well established that foreign language anxiety is one of the strongest predictors of WTC in L1 and L2 (Cetinkaya 2005; Hashimoto 2002; Kim 2004; Knell and Chi 2012; MacIntyre et al. 2002a, b; Wu and Lin 2014). MacIntyre et al. (1999) define language learning anxiety as “worry and negative emotional reaction aroused when learning or using a second language” (p. 27).

Early research on language anxiety attempted to differentiate its potential to facilitate learning from its debilitating effects (Scovel 1978). Although it is possible that some anxiety might provide arousal needed for an otherwise bored student to engage with a task (MacIntyre et al. 1999), most often the empirical research shows language anxiety is negatively related to L2 performance. It must be emphasized that researchers such as Horwitz et al. (1986) argue strongly that language anxiety is a situation-specific type of anxiety that language learners experience while performing a task and, therefore, it is not appropriate to use non-language-related anxiety concepts (such as neuroticism, test anxiety or even fear of bugs) in the present meta-analysis.

Research has found significant correlations between language anxiety and L2 WTC (Cetinkaya 2005; Ghonsooly et al. 2012, 2013; Hashimoto 2002; Kim 2004; Knell and Chi 2012; Wu and Lin 2014). Also, studies have consistently found that communication apprehension is strongly related to self-perceived communicative competence in the first language (MacIntyre 1994) and foreign language communication (Hashimoto 2002; MacIntyre and Charos 1996).

It should be mentioned that researchers studying communication in the L1 agree that communication anxiety and self-perceived communicative competence are the two main factors that could influence one’s WTC (MacIntyre 1994; MacIntyre et al. 1999). However, studies have indicated that the relative extent to which the two variables influence learners’ WTC is different (MacIntyre et al. 1999; MacIntyre and Charos 1996). MacIntyre and Charos’ (1996) study was conducted among beginner learners of French in the bilingual city of Ottawa. The correlation coefficients indicated that perceived communicative competence was more strongly correlated with WTC than language anxiety was, and the path analysis showed the impact of these two variables on WTC to be equally strong. Moreover, they found that communication anxiety directly influences students’ perceived competence.

Additionally, Yashima (2002) and Yashima et al. (2004) found a strong correlation between EFL (English as a Foreign Language) learners’ perceived communicative competence and WTC compared to the correlation between language anxiety and WTC. The findings of Hashimoto’s (2002) study carried out in a Japanese ESL (English as a Second Language) setting were similar to the previous ones, with a stronger relationship between perceived communicative competence and WTC than language anxiety and WTC. Although their study was conducted in an immersion context, MacIntyre et al. (2002a, b) also found that perceived communicative competence had a stronger correlation with WTC than language anxiety.

However, some studies found that language anxiety had a stronger relationship with L2 WTC than perceived communicative competence. Baker and MacIntyre (2000) examined two groups of university students in the bilingual Canadian context; the first group had immersion experience but the second group had only learned French as a second language (FSL). In the first group, the only correlation was between communication anxiety and WTC; but for the second group, WTC was only associated with perceived communicative competence. Moreover, having explored the role of gender and immersion in communication, they found that the WTC of those participants who took part in immersion programs correlated only moderately with language anxiety; and the WTC of non-immersion students presented a significant but weak correlation with language anxiety (r = -.29, p < .01) but a strong correlation with perceived competence (r = .72, p < .01). However, the correlation between second language anxiety and L2 WTC was slightly weaker in the non-immersion than in the immersion group.

As these studies showed, the impact of communication anxiety and perceived competence on one’s WTC may be associated to some extent with the learning context or language experience. Consistent with McCroskey and Richmond’s (1991) theory, with regular language use, language anxiety had a larger effect on learners’ WTC. However, in foreign or second language contexts where language use appears to be less frequent, the learners’ perceived communicative competence was more consistently correlated with their WTC.

Motivation

In SLA, motivation has been the most explored individual difference variable to date, and it has been shown consistently that learners’ motivation correlates positively with their target language proficiency (Gardner 1985). Motivation is considered to be an internal property of the learner that can be influenced by outside factors. It is the driving force which paves the way for more effortful and efficient learning, affecting both the rate and success level of acquisition (Dörnyei 1998). According to Gardner (1985), motivation refers to an amalgamation of the learner’s desires, attitudes, and efforts which encourage them to learn the target language. It should be noted that motivation has been operationally defined based on different theories in the field of applied linguistics so far.

Many studies show that motivation has a positive correlation with WTC. Thus, language learners with higher levels of motivation tend to be more willing to communicate (Cetinkaya 2005; Knell and Chi 2012; Liu and Park 2012; Peng and Woodrow 2010; Wu and Lin 2014). MacIntyre and Charos (1996) used path analysis to predict success in second language learning and communication. They found significant relationships between language learning motivation and WTC in the second language. Hashimoto (2002) used MacIntyre et al.’s (1998) model of WTC and examined the effectiveness of applying affective variables such as motivation and second language anxiety to predict Japanese ESL students’ WTC within a classroom setting. He found a significant relation between WTC and motivation, and concluded that WTC probably had motivational features (Hashimoto 2002).

A large number of studies have indicated that motivation can indirectly predict WTC, exerting its effects by influencing communication confidence (Cetinkaya 2005; Ghonsooly et al. 2012; Khajavy et al. 2016; Peng and Woodrow 2010; Yashima 2002). These results suggest that despite the close association of motivation with L2 WTC, learners experiencing higher levels of language learning motivation may not necessarily experience higher levels of WTC (Peng and Woodrow 2010).

Studies in immersion settings have suggested that motivational factors have a vital role in affecting one’s WTC, as have studies carried out in foreign language or second language contexts. Baker and MacIntyre’s (2000) study examined differences in immersion versus non-immersion students’ motivation in relation to the communication variables. In this study motivation was measured by the Guilford version of Gardner’s attitude/motivation test battery (AMTB) (see Gardner and MacIntyre 1993). The results showed that there was a positive correlation between learners’ motivation and second language WTC in both groups. However, the correlation between the two variables was stronger among the immersion students (r = .61, p < .01 as opposed to r = .38, p < .01 in the non-immersion group). MacIntyre et al. (2002a, b) also reported significant correlations between motivation and WTC.

The Present Study

The above literature review suggests that the three high-evidence correlates of trait-level L2 WTC (perceived competence, language anxiety and motivation) appear to be reliable predictors. The present study reports a meta-analysis that will provide a statistical summary of how strongly L2 WTC is correlated with these three predictors. Meta-analysis provides a quantitative synthesis of the strength of the relationships while explicitly taking into account various features of individual studies that might affect the reported results, providing an advantage over a narrative review. In addition, meta-analysis can estimate the robustness of the relationships by estimating the potential impact on the literature of “file-drawer” studies, those less likely to be published due to non-significant correlations.

The present meta-analysis addressed the following research question:

What are the relative strengths of association between L2 WTC and perceived communicative competence, language anxiety, and motivation?

Method

Inclusion Criteria

Five main databases (LLBA, ERIC, PsycINFO, SCOPUS, and Google Scholar) and major journals in the area of second language acquisition (e.g., Studies in Second Language Acquisition, Applied Linguistics, Journal of Applied Linguistics, Foreign Language Annals, Language Learning, Canadian Modern Language Review, International Review of Applied Linguistics, Language Teaching Research, Language Testing, The Modern Language Journal, Second Language Research, System, TESOL Quarterly) were searched to locate relevant studies that included L2 WTC. Regarding temporal parameters, studies published between 2000 and 2015 were considered for inclusion. This period of time was selected because the WTC pyramid paper was published by MacIntyre et al. (1998) and the year 2000 roughly corresponds to the initial published investigations of WTC in second language contexts. Fifteen years of research provides sufficient data to examine changing trends and predictive factors of WTC across the 2000s, up to the end of 2015, which was the time of data collection for this study. Published material was located using different wording patterns of the following key terms: WTC, language anxiety, perceived communicative competence, motivation, attitude, predictors, components, and analysis. Full texts of the articles were retrieved for further examination if the abstracts indicated a correlational design. In total, 64 studies were selected. We then checked each study individually against the information required for conducting a meta-analysis to decide whether it could be included.

The criteria defining the body of research for meta-analysis must be first broad enough to yield firm findings across a number of studies; and second, research must be conceptually and adequately narrowed to “avoid inappropriate aggregation of findings” (Plonsky 2011, p. 1002). For the present meta-analysis, studies were reviewed using the following inclusion criteria:

  1. The study was published in a peer reviewed journal between 2000 and 2015.

  2. The study was a dissertation completed between 2000 and 2015.

  3. The study examined English language that was either a second or foreign language for the participants.

  4. Both WTC inside and outside the classroom were considered for meta-analysis.

  5. The study included the necessary information for the calculation of effect sizes, including mean, standard deviation or variance, and sample size, for each group.

  6. Concerning independent observations, studies which reported on duplicate samples were meticulously explored and just one study for each independent sample was incorporated in the inclusion list.

The final sample included 22 studies, including both high-evidence and low-evidence correlates of L2 WTC comprising 60 independent effect sizes, 4794 total participants, and 18,631 participants considering all the samples of independent studies. Out of these 22 studies only 11 focused on the three high-evidence correlates of L2 WTC defined as perceived communicative competence, anxiety, and motivation producing 32 effect sizes and 8219 total participants (see Table 1).

Table 1 The individual studies and the independent correlations used in the meta-analysis

Analytical Procedures

We coded all of the studies for several features (e.g. perceived communicative competence, anxiety, and motivation) based on a pilot-tested protocol that demonstrated perfect inter-rater agreement. To ensure coding accuracy, approximately 25% of the information collected from each study was compared to the original report by two of the researchers of the study who were thoroughly aware of the coding protocol. The inter-coder reliability for the correlation data was 90%, indicating a high level of agreement between the two coders. In the case of disagreements, having discussed them, we reached agreement or consulted the original studies.

Meta-analytic Procedures

The details of each study were entered into the software Comprehensive Meta-Analysis (Borenstein et al. 2005). For studies reporting corrected correlations, only the reported correlations were used. Having completed the data entry, we used two approaches to assess publication bias: first, the classic fail-safe N (Orwin 1983) and second, Egger’s Test of the Intercept. Both approaches showed little concern for publication bias.

Results

Results of the meta-analysis will be considered in three segments. First, a consideration of the potential for publication bias (the so-called file drawer effect) to influence the interpretation of the results is presented. Second, the estimates of effect sizes for the three high-evidence correlates will be presented, followed by an analysis of variability in the effect sizes.

Publication Bias Information

Publication bias arises when studies that were never completed or published are missing from the analysis, so that the collected studies are not representative of all the research conducted. This bias would overestimate the true mean effect size; therefore, it is important to measure the possibility of bias, and its potential influence on the conclusions and interpretations (Rothstein et al. 2005).

Different statistical procedures can be applied to assess publication bias. In the present study, the classic Fail-Safe N and Egger’s Test of the Intercept were used. The findings of Fail-Safe N indicated a z-value of 14.95 and a 2-tailed p value less than 0.01 with a fail-safe N of 1832 (Table 2). This means that 1832 ‘null’ studies would need to be located and incorporated for the combined 2-tailed p value to exceed 0.050. This is a rather large number of studies, suggesting that publication bias is unlikely to account for the results of this study.
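As a rough check on the arithmetic, the classic fail-safe N can be recovered from the combined z-value and the number of effect sizes. The sketch below assumes the Rosenthal-type formula with a Stouffer combined z and a two-tailed alpha of .05, and takes k = 32 from the number of high-evidence effect sizes reported above; the function name is illustrative.

```python
from statistics import NormalDist


def classic_fail_safe_n(combined_z, k, alpha=0.05):
    """Number of additional null (z = 0) results needed to push the
    combined two-tailed p value above alpha.

    Assumption: the combined z is Stouffer's z, i.e. sum(z_i) / sqrt(k),
    so adding N null results gives sum(z_i) / sqrt(k + N).
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    sum_z = combined_z * k ** 0.5
    return (sum_z / z_crit) ** 2 - k


if __name__ == "__main__":
    # Values as reported in the text: z = 14.95 across 32 effect sizes.
    print(round(classic_fail_safe_n(14.95, 32), 1))
```

Under these assumptions the result is roughly 1830, close to the reported 1832; the small gap is plausibly due to rounding of the reported z-value.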

Table 2 Classic Fail-Safe N

In addition, Egger’s Test of the Intercept yielded an intercept of 0.49, 95% confidence interval (− 6.138, 7.134), with t = 0.153, df = 30 (Table 3). The recommended 1-tailed p value is 0.43, and the 2-tailed p value is 0.87. The further the intercept deviates from zero, the more pronounced the asymmetry in the set of studies and, hence, the stronger the indication of bias. A p value of 0.1 or smaller for the intercept is considered statistically significant. Here the intercept of .49 deviates only slightly from zero and the asymmetry is not statistically significant (p = .87); thus, no evidence of publication bias was found.
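For readers unfamiliar with the procedure, Egger’s test can be sketched as an ordinary least squares regression of each standardized effect (effect divided by its standard error) on its precision (one divided by the standard error); the intercept is the asymmetry estimate and is tested with k − 2 degrees of freedom. The sketch below is a generic illustration rather than the routine implemented in Comprehensive Meta-Analysis, and the input data in the usage example are hypothetical.

```python
from math import sqrt


def egger_intercept(effects, std_errors):
    """Egger's regression test (generic sketch).

    Regresses effect/SE on 1/SE by ordinary least squares and returns
    (intercept, se_of_intercept, t, df). A p value would additionally
    require a t distribution, which is not computed here.
    """
    k = len(effects)
    if k < 3 or k != len(std_errors):
        raise ValueError("need at least 3 effects with matching standard errors")
    x = [1.0 / se for se in std_errors]               # precision
    y = [e / se for e, se in zip(effects, std_errors)]  # standardized effect
    x_bar, y_bar = sum(x) / k, sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    df = k - 2
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = sqrt(sse / df * (1.0 / k + x_bar ** 2 / sxx))
    t = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t, df


if __name__ == "__main__":
    # Hypothetical Fisher-z effects and standard errors, for illustration only.
    effects = [0.55, 0.40, 0.62, 0.35, 0.48, 0.51]
    std_errors = [0.12, 0.06, 0.15, 0.05, 0.09, 0.10]
    print(egger_intercept(effects, std_errors))
```

With correlations, the effects would typically be Fisher z values with standard errors of 1/sqrt(n − 3); that choice is an assumption here, since the source does not describe the transformation used.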

Table 3 Egger’s Test of the Intercept

Perceived Communicative Competence

Eleven available independent correlations from 9 studies with 2788 participants (see Table 4) were analyzed for the perceived communicative competence variable (mean sample size = 309.77, range = 56–579).

Table 4 Independent correlations between L2 WTC and PCC

The relevant statistics of these studies (their significance, 95% CI), with graphic representations are presented in Table 5 and Fig. 1 respectively. The overall mean correlation was moderate, r = .48, 95% CI [.38 to .56] and statistically significant (p < .001).
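A pooled correlation and its confidence interval of this kind are commonly obtained by transforming each r to Fisher’s z, averaging with inverse-variance weights, and transforming back. The following minimal sketch assumes a DerSimonian-Laird random-effects model; the source does not state which model was specified in the software, and the correlations and sample sizes in the usage example are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def pool_correlations(rs, ns, level=0.95):
    """Random-effects pooling of correlations via Fisher's z
    (DerSimonian-Laird estimate of between-study variance)."""
    zs = [atanh(r) for r in rs]            # Fisher z transform
    vs = [1.0 / (n - 3) for n in ns]       # sampling variance of each z
    w = [1.0 / v for v in vs]              # fixed-effect weights
    sw = sum(w)
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sw
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    df = len(rs) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in vs]  # random-effects weights
    sw_re = sum(w_re)
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sw_re
    se = sqrt(1.0 / sw_re)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return {
        "r": tanh(z_re),
        "ci": (tanh(z_re - crit * se), tanh(z_re + crit * se)),
        "q": q,
        "df": df,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Hypothetical correlations and sample sizes, for illustration only.
    print(pool_correlations([0.53, 0.41, 0.60, 0.35], [235, 579, 56, 330]))
```

Because the back-transformation is nonlinear, the resulting interval is not symmetric around the pooled r, which is consistent with intervals such as [.38 to .56] around .48.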

Table 5 Relevant statistics of L2 WTC correlations with PCC
Fig. 1 Overall average correlation (displayed by a diamond) and correlation with confidence interval for each study correlating perceived communicative competence and L2 WTC. Note 1: a = WTC in meaning-focused activities; b = WTC in form-focused activities in Peng and Woodrow. Note 2: a = WTC and PCC for Humanities students; b = WTC and PCC for Engineering students in Ghonsooly, Khajavy, and Asadpour

Language Anxiety

A total of 12 independent correlations from 10 studies comprising 2876 participants (see Table 6) were analyzed concerning the variable language anxiety (mean sample size = 287.63, range = 56–579).

Table 6 Independent correlations between L2 WTC and anxiety

The summary of the study results is presented in Table 7, showing that the overall mean correlation was small (r = − .29), 95% CI [− .38 to − .19] and statistically significant (p < .001) (see Fig. 2).

Table 7 Relevant statistics of L2 WTC correlations with anxiety
Fig. 2 Overall average correlation (displayed by a diamond) and correlation with confidence interval for each study correlating anxiety and L2 WTC. Note 1: a = WTC in meaning-focused activities; b = WTC in form-focused activities in Peng and Woodrow. Note 2: a = WTC and anxiety for Humanities students; b = WTC and anxiety for Engineering students in Ghonsooly, Khajavy, and Asadpour

Motivation

Nine correlations from 8 studies, including 2555 participants (see Table 8), were analyzed (mean sample size = 319.37, range = 56–579).

Table 8 Independent correlations between L2 WTC and motivation

The included studies, the relevant statistics, and their graphic representations are presented in Table 9 and Fig. 3, respectively. The overall mean correlation was moderate (r = .37), 95% CI [.32 to .42] and statistically significant (p < .001).

Table 9 Relevant statistics of L2 WTC correlations with motivation
Fig. 3 Overall average correlation (displayed by a diamond) and correlation with confidence interval for each study correlating motivation and L2 WTC. Note: a = WTC in meaning-focused activities; b = WTC in form-focused activities in Peng and Woodrow

Test of Heterogeneity

Several statistics, such as Q, T², T, and I², can reveal the heterogeneity of the above estimates in the meta-analysis.

Perceived Communicative Competence

The variability across studies was large and statistically significant, Q(10) = 88.81, p < .001, I² = 88.74, suggesting that moderator variables influence the correlation between perceived communicative competence and L2 WTC. Table 10 illustrates that Q (88.81) is far larger than the degrees of freedom (10). This means that T and T², the measures of between-study heterogeneity, are large. In other words, I², the proportion of the observed variance that reflects real differences in effect sizes, is close to 0.90, which is very high. I² is sensitive neither to the number of studies nor to the metric of the effect sizes. According to the benchmarks of I² proposed by Higgins et al. (2003), values higher than 75% are considered high. Therefore, speculations about the possible sources of variance in terms of outcome, setting, and proficiency are justifiable.
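The relationship between Q, its degrees of freedom, and I² can be made explicit with a short sketch based on the definition in Higgins et al. (2003), I² = 100 × (Q − df) / Q, truncated at zero. The function name is illustrative; the inputs are the Q values and degrees of freedom reported in this section.

```python
def i_squared(q, df):
    """I-squared as a percentage: share of observed variance in effect
    sizes attributable to real differences rather than sampling error."""
    if q <= 0:
        return 0.0
    return max(0.0, 100.0 * (q - df) / q)


if __name__ == "__main__":
    # (Q, df) pairs as reported for the three correlates.
    for label, q, df in [("PCC", 88.81, 10),
                         ("Anxiety", 86.20, 11),
                         ("Motivation", 13.77, 8)]:
        print(label, round(i_squared(q, df), 2))
```

Applied to the reported values, this reproduces the I² figures for perceived communicative competence and anxiety to two decimals and the motivation figure to within rounding.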

Table 10 Test of heterogeneity for the correlations between L2 WTC and PCC

Anxiety

Moreover, the variability across studies of anxiety was large and statistically significant, Q(11) = 86.20, p < .001, I² = 87.24, suggesting that moderator variables affect the correlation between anxiety and L2 WTC. Table 11 illustrates that Q (86.20) is far larger than the degrees of freedom (11), which means that T and T², the measures of heterogeneity, are large. In other words, I², the proportion of the observed variance that reflects real differences in effect sizes, is 87.24%, which is very high.

Table 11 Test of heterogeneity for the correlations between L2 WTC and anxiety

Motivation

In addition, the results of the test of heterogeneity showed that variability across studies of motivation was not statistically significant, Q(8) = 13.77, p = 0.08, I² = 41.93, providing only limited evidence of moderator variables influencing the correlation with motivation. Table 12 illustrates that Q (13.77) is not far larger than the degrees of freedom (8), indicating that T and T², the measures of heterogeneity, are not large.

Table 12 Test of heterogeneity for the correlations between L2 WTC and motivation
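
The I2 values reported for the three correlates can be approximately recovered from Q and its degrees of freedom. The following is a minimal sketch, assuming the standard definition I2 = (Q − df) / Q × 100, truncated at zero; the function name `i_squared` and the dictionary layout are illustrative and are not taken from the software used in the study.

```python
def i_squared(q: float, df: int) -> float:
    """Return I2 as a percentage, assuming I2 = (Q - df) / Q * 100, floored at 0."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Q statistics and degrees of freedom as reported in the heterogeneity tests
reported = {
    "perceived communicative competence": (88.81, 10),
    "anxiety": (86.20, 11),
    "motivation": (13.77, 8),
}

# Higgins et al. (2003): I2 values higher than 75% are considered high
HIGH_THRESHOLD = 75.0

results = {name: round(i_squared(q, df), 2) for name, (q, df) in reported.items()}

for name, value in results.items():
    label = "high" if value > HIGH_THRESHOLD else "below the high benchmark"
    print(f"{name}: I2 = {value:.2f}% ({label})")
```

Under this assumption, the values for perceived communicative competence and anxiety match the reported I2 to two decimals, and the value for motivation comes out at about 41.9, which agrees with the reported 41.93 up to rounding of Q.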

Discussion

The present meta-analysis examined the correlation between L2 WTC and three high-evidence correlates (perceived communicative competence, anxiety, and motivation). The results indicated that perceived communicative competence (r = .48, CI [.38, .56]), language anxiety (r = −.29, CI [−.38, −.19]), and motivation (r = .37, CI [.32, .42]) had significant correlations with L2 WTC. In addition to these three variables, there are several low-evidence correlates that researchers in the domain of WTC have investigated less frequently, including attitudes toward L2, attitudes toward learning, proficiency level, future self, ideal L2 self, ought-to self, listening, speaking, reading, and writing. There were too few effect sizes available in the literature to address these variables using meta-analysis at this time. If future research continues to report correlations for these or other variables, additional meta-analyses might be undertaken.

Comparing the summary effects for the relationship of L2 WTC with perceived communicative competence, language anxiety, and motivation can shed more light on the relative contribution of each variable to L2 WTC. The computation of effect sizes is a mathematical process, but their interpretation depends on the researcher; no statistical software can judge the meaningfulness of a computed effect size. Even the oft-cited criteria published by Cohen (1988) for evaluating effect sizes as small, medium, or large were offered reluctantly, because such criteria are appropriate only when there is not adequate knowledge available for making informed judgments (Hedges and Hedberg 2007). However, Cohen’s cautionary arguments seem to have been largely ignored by most researchers in the social sciences. As Plonsky and Oswald (2014) argued, benchmarks like Cohen’s were intended for the analysis of statistical power and are therefore not appropriate for interpreting the findings of social research. Interpreting an effect size as small or large without considering the effect sizes found in the related literature is difficult; the interpretation of effect sizes requires comparison, is relative, and is context-dependent (Plonsky and Oswald 2014). Plonsky and Oswald further contended that “Cohen’s benchmarks generally underestimate the effects gained in L2 research” (Plonsky and Oswald 2014, p. 18), so these benchmarks do not seem applicable to the interpretation of effect sizes in applied linguistics. Thus, they introduced field-specific benchmarks for d and r based on a description of a body of L2 research encompassing 346 primary studies and 91 meta-analyses. For interpreting correlation coefficients, they proposed that rs close to .25, .40, and .60 be considered small, medium, and large, respectively (Plonsky and Oswald 2014).

However, Plonsky and Oswald (2014) emphasized an important point of caution. They asserted that “the benchmarks we have offered as a result of our study are meant to serve as very general indicators of the magnitude of mean differences and correlations typically observed in L2 research” (p. 14). Therefore, to support a more reliable and meaningful interpretation of effect sizes in applied linguistics, they introduced eight key criteria, one of which is to compare effect sizes with those of previous studies and similar relationships (Plonsky and Oswald 2014). They also suggested that “…for primary studies, a meta-analysis of the domain or subdomain to which the study belongs, if available, is a great place to start” (Plonsky and Oswald 2014, p. 18), which adds to the significance of this meta-analytic study for the interpretation of future primary studies on L2 WTC.

Comparing the Findings of this Study with a Previous Meta-Analysis

For the current meta-analysis, we were not able to compare our findings with other meta-analyses on L2 WTC because our study is the first meta-analysis of the association between L2 WTC and its correlates. As a solution, we followed the suggestion of Plonsky and Oswald (2014) that, when no meta-analysis exists in a given domain, meta-analytic findings from other acknowledged domains such as education, cognitive science, and individual differences can serve as sources of comparison. Given that only one meta-analysis in applied linguistics (Jeon and Yamashita 2014) used correlation as the measure of effect size, whereas other studies mainly applied measures such as Cohen’s d or Hedges’ g, we compared the findings of that study with those of the present one. Furthermore, the CIs of the two studies, which represent their sampling uncertainty, were compared.

Based on the results of both Jeon and Yamashita (2014) and the current study, we can conclude that the three high-evidence correlates of WTC have moderate correlations with willingness to communicate compared with the overall average correlation found by Jeon and Yamashita (2014). In addition to the effect sizes of the two studies, their CIs should be considered in order to make more precise comparisons. The CIs of the high-evidence correlates reported by Jeon and Yamashita (2014) were wider than those of the present study, especially the CI for grammar knowledge and reading (.37). This means that the effect sizes of the present study are estimated more precisely than those of Jeon and Yamashita (2014).

Considering the relative strength of the three high-evidence correlates included in the present study, we can regard the effect of motivation (.37) as larger in magnitude than that of anxiety (.29). The confidence interval for motivation is also narrower than that for anxiety (a spread of .10 for motivation and .19 for anxiety), and its standard error is smaller (.009 for motivation and .026 for anxiety). This indicates a more consistent relationship between motivation and L2 WTC than between anxiety and L2 WTC. However, the relationship between perceived communicative competence and WTC appears to be the strongest among the three high-evidence correlates (.48). This level is close to Cohen’s (1977) arbitrary cut-off of .50 for a large effect. The narrow confidence interval of .18 and the small standard error of .19 confirm the significant power of perceived communicative competence among the three correlates of WTC. This result is supported by the findings of previous studies (Cetinkaya 2005; Ghonsooly et al. 2012; Hashimoto 2002; MacIntyre and Charos 1996; Munezane 2013; Peng and Woodrow 2010; Yashima 2002; Yashima et al. 2004), which imply that perceived communicative competence is the most significant predictor of L2 WTC. In the same vein, Yashima (2002) and Yashima et al. (2004) found a stronger correlation between EFL learners’ perceived communicative competence and WTC than between communication apprehension and WTC.
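
The comparison of magnitudes and interval widths above can be reproduced directly from the reported summary correlations and confidence limits. The sketch below is illustrative only: the names `summary`, `ci_width`, and `ranked` are hypothetical, and the width is simply the upper limit minus the lower limit, rounded to two decimals.

```python
# Summary correlations and confidence limits as reported in the Discussion:
# (r, lower limit, upper limit)
summary = {
    "perceived communicative competence": (0.48, 0.38, 0.56),
    "anxiety": (-0.29, -0.38, -0.19),
    "motivation": (0.37, 0.32, 0.42),
}


def ci_width(lower: float, upper: float) -> float:
    """Spread of a confidence interval, rounded to two decimals."""
    return round(upper - lower, 2)


# Width of each confidence interval
widths = {name: ci_width(lo, hi) for name, (_, lo, hi) in summary.items()}

# Correlates ordered by absolute magnitude of r, strongest first
ranked = sorted(summary, key=lambda name: abs(summary[name][0]), reverse=True)

for name in ranked:
    r = summary[name][0]
    print(f"{name}: |r| = {abs(r):.2f}, CI width = {widths[name]:.2f}")
```

Ordering by absolute value is what allows the negative anxiety correlation to be compared with the two positive correlations on the same scale.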

Furthermore, as noted, Plonsky and Oswald (2014) suggested that rs close to .25 be considered small, .40 medium, and .60 large. Judged against these benchmarks, the correlations of L2 WTC with perceived communicative competence, anxiety, and motivation (.48, .29, and .37 in magnitude) represent small to medium effect sizes.
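
One way to make this comparison explicit is to assign each summary correlation to the closest benchmark. The sketch below assumes a nearest-benchmark rule, which is our illustrative reading of "close to" and is not a procedure specified by Plonsky and Oswald (2014); the names `BENCHMARKS` and `nearest_benchmark` are hypothetical.

```python
# Field-specific benchmarks for r suggested by Plonsky and Oswald (2014)
BENCHMARKS = {"small": 0.25, "medium": 0.40, "large": 0.60}


def nearest_benchmark(r: float) -> str:
    """Label |r| with the closest benchmark (an assumed rule, for illustration)."""
    return min(BENCHMARKS, key=lambda label: abs(abs(r) - BENCHMARKS[label]))


# Summary correlations reported in the present meta-analysis
correlations = {
    "perceived communicative competence": 0.48,
    "anxiety": -0.29,
    "motivation": 0.37,
}

labels = {name: nearest_benchmark(r) for name, r in correlations.items()}

for name, label in labels.items():
    print(f"{name}: r = {correlations[name]:+.2f} -> closest to '{label}'")
```

Under this rule, anxiety falls closest to the small benchmark, while motivation and perceived communicative competence fall closest to the medium benchmark, with perceived communicative competence lying between medium and large.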

Investigating the Possible Role of Moderators in L2 WTC Correlates

Finally, the possible role of moderators in the summary effects for the relationship between WTC and each of the three high-evidence correlates can be explored based on the findings of the heterogeneity tests. The I2 values for anxiety (87.24) and perceived communicative competence (88.74) are much larger than that observed for motivation (41.93). Given that the benchmark for high heterogeneity is an I2 of 75%, moderators likely play a significant role in the relationship of WTC with language anxiety and perceived communicative competence, but not with motivation. However, the investigation of these moderators is beyond the scope of this study. Such a result is not surprising in the SLA literature given the large number of learner and contextual variables that come into play when considering WTC, such as intergroup processes, personality, self-related cognition, contextual variation in opportunities to use the L2, instructional practices, and political or demographic trends.

Conclusion

Over the fifteen years between 2000 and 2015, enough research was published to allow for the identification of three high-evidence correlates of L2 WTC. The meta-analysis reported in the present study found that L2 WTC is more highly correlated with perceived communicative competence than with anxiety or motivation across studies, and that moderators likely play a pivotal role in the relationship of L2 WTC with anxiety and perceived communicative competence, although the evidence for moderators is weaker in the case of motivation.

However, regarding the limitations of the study, we should note that the individual studies used in this meta-analysis adopted different conceptualizations of motivation (e.g., socio-educational or self-determination perspectives) and of anxiety, and these differences were not considered in the synthesis of the research on WTC. Had we categorized the different conceptualizations of motivation and anxiety separately, we would have been left with too few correlational studies to justify a meta-analysis. The different operational definitions of the high-evidence correlates of L2 WTC can influence the interpretation of the findings. Our results also pave the way for future investigation of relatively under-investigated variables that may shed more light on L2 WTC. These variables include (but are not limited to) attitude towards L2, attitude towards learning, proficiency level, future self, ideal L2 self, ought-to self, age, and gender. Although they have been explored less frequently, they might play an important role in predicting L2 WTC. Thus, along with research into dynamic changes in situated WTC, researchers are invited to devote more effort to studying the longer-term moderators and mediators that contribute to variation in L2 WTC in a variety of contexts.

Language learning context (foreign language or second language, and classroom environment) was among the potentially important moderator variables not sufficiently examined in the present study. Language learning context can affect WTC and its correlation with other variables; therefore, future studies can examine the role of context. With the advance of ecological research in applied linguistics, we hope that contextual effects on the correlates of L2 WTC can be explored. This kind of research could help researchers better understand the interaction between different variables and WTC in different contexts. Recent research in a laboratory context (MacIntyre and Legatto 2011; MacIntyre and Serroul 2015) suggests that the dynamics underlying rapid changes in L2 WTC, as communication events unfold, are an interesting avenue for future research and theorizing. In addition, recent research on WTC in a classroom context provides further evidence of the complexity of the processes underlying WTC in a second language (Mystkowska-Wiertelak and Pawlak 2014). The findings of this study can help us understand both the macro- and micro-perspectives of learners’ experiences with both state and trait levels of L2 WTC.