Introduction

Over the years, researchers have shown an increased interest in exploring student writers’ use of metadiscourse markers, with a focus on first language (L1) and proficiency effects (e.g., Aull, 2019; Bax, Nakatsuhara, & Waller, 2019; Ho & Li, 2018; Hong & Cao, 2014; Lee & Deakin, 2016). The concept of metadiscourse has been defined and understood differently across scholars; in this study, metadiscourse is viewed as “linguistic resources used to organize a discourse or the writer’s stance towards either its content or the reader” (Hyland & Tse, 2004, p. 157). It particularly focuses on the interactional dimension of metadiscourse that allows writers to convey their ideas more effectively to their audience (Hyland, 2005). Previous second language (L2) studies on interactional metadiscourse have demonstrated that learner essays, when compared with those by native English-speaking (NS) counterparts, tend to be characterized by the misuse and underuse of metadiscourse features (e.g., Çandarlı, Bayyurt, & Martı, 2015; Hyland & Milton, 1997; Lee & Deakin, 2016). Other studies have also examined the use of interactional metadiscourse markers in relation to learners’ L1 background (e.g., Hong & Cao, 2014) and L2 proficiency (e.g., Bax et al., 2019), and their findings have commonly pointed to the importance of increased pedagogical attention to interactional metadiscourse for L2 writers (detailed review of previous findings offered in the literature review section).

This study is grounded in Hyland’s (2005) interactional metadiscourse model which consists of two dimensions, namely, stance and engagement. Stance involves writer-oriented resources of interaction that writers would use when presenting their opinions and attitudes toward a proposition. In Hyland’s model, stance includes the categories of hedges, boosters, attitude markers, and self-mentions. Hedges are linguistic resources that express writers’ uncertainty about a proposition and accept the possibility of alternative ideas (e.g., arguably, maybe, usually), while boosters are to convey writers’ commitment and confidence in their argument (e.g., every single, evidently, obviously). Attitude markers show writers’ emotions and affective attitude towards a proposition (e.g., amazingly, cruelly, unusually). Self-mentions refer to explicit authorial presence in the text, which is realized mostly through first person pronouns (Hyland, 2001).

The other side of the framework, engagement, involves the writer’s recognition of readers’ presence and willingness to include them as legitimate discourse participants. The categories of engagement include reader pronouns, directives, questions, knowledge reference, and personal asides. Reader pronouns involve the use of second person pronouns and, in some contexts, first person plural pronouns (inclusive we). Directives are intended to direct readers to perform some acts mainly through the use of imperative constructions and obligation modals (Hyland, 2002a). From the engagement categories, this study examines only reader pronouns and directives because the remaining categories (questions, knowledge reference, and personal asides) have been found to occur minimally in timed argumentative writing (Hyland, 2005; Zhao, 2017).

While Hyland’s (2005) comprehensive model has gained much attention in empirical research, previous studies have tended to focus on only a few categories (mostly hedges and boosters) and to quantify occurrences of metadiscourse markers based on subjective judgments that often differ from study to study. Also, potentially due to the labor-intensive nature of manually identifying target markers, the number of learner essays used for the exploration of metadiscourse has been relatively small. In an effort to see the developmental trajectory of how L2 writers express their attitudes toward a proposition and awareness of an audience, the current study involves a comprehensive quantitative analysis of 1986 learner essays using an automated processing tool that computes quantity scores of each metadiscourse category (Yoon, 2017a). The focus of this analysis is on identifying notable patterns of interactional metadiscourse by two learner variables (L1 background and L2 proficiency) and one task variable (writing topic). This study specifically attends to East Asian EFL students with different L1 backgrounds (Chinese, Japanese, and Korean) to reveal potential L1 background effects. These learner groups are targeted because of the large proportion of English language learners speaking these languages; students from China constitute 33.7% of the international student population studying in the United States, and those from China, Japan, and South Korea add up to 40.2% (Institute of International Education, 2019). Results from the learner data are then compared with those from native English speaker essays, a suitable target for EFL writers. Writing topic, as one of the task characteristics, has been explored in previous L2 writing research (e.g., Yoon, 2017b; Yang & Kim, 2020; Yang, Lu, & Weigle, 2015), but scholars have mostly attended to topic effects on linguistic features of syntactic complexity, accuracy, and fluency. Addressing these gaps in the literature, this corpus-based study aims to achieve a better understanding of L2 writers’ use of interactional metadiscourse markers in written argumentation and offer implications for writing instruction and assessment.

Interactional metadiscourse in L2 writing research

Argumentative writing has been the most typical genre for adult L2 writers due to its prevalent use in educational and testing contexts (Qin & Karabacak, 2010; Wolfe, 2011). Thus, despite the complex nature of establishing one’s position for convincing arguments, adult L2 writers are likely to have built task schemas and genre awareness for written argumentation. In the context of argumentative and research writing, previous studies have endeavored to identify distinct patterns of metadiscourse use between L1 and L2 writers (e.g., Çandarlı et al., 2015; Hyland & Milton, 1997; Lee & Deakin, 2016), and between L2 writers with varied L1 backgrounds (e.g., Hong & Cao, 2014; Lee & Casal, 2014). Using a contrastive perspective, these studies have commonly informed us that L2 learners have difficulties attaining native-like command of interactional metadiscourse in writing. For example, an early study by Hyland and Milton (1997) compared Chinese L2 learners’ essays with those by L1 writers with regard to hedge and booster expressions. They found L2 learners’ overuse of boosters and underuse of hedges, which led to Chinese learners’ stronger commitments to their assertions in writing. Similarly, Lee and Deakin (2016) examined three corpora of 25 high-rated English as a second language (ESL) essays, 25 low-rated ESL essays (by undergraduate students speaking Chinese as their L1), and 25 high-rated L1 essays (by NS undergraduate students) in terms of their inclusion of interactional metadiscourse markers. The results of Lee and Deakin showed that the high-rated ESL and L1 essays contained higher numbers of hedges than the low-rated ESL essays. Their findings also indicated that the ESL essays commonly contained far fewer self-mentions than the L1 essays, while occurrences of boosters and attitude markers did not differ significantly between the corpora.

L2 scholars have interpreted distinct patterns of metadiscourse features between L1 and L2 writing, with a focus on cultural and contextual factors, as well as awareness of academic conventions (e.g., Hyland & Milton, 1997; Lee & Deakin, 2016; Li & Warton, 2012). For the overall trend of fewer types of metadiscourse markers in L2 writing, it has been argued that L2 learners’ limited repertoire of metadiscourse items leads them to rely repeatedly on a narrow range of items. It may also be their limited pragmatic competence that causes L2 writers’ difficulties in using metadiscourse appropriately across registers (Hong & Cao, 2014; Hyland & Milton, 1997). That is, some L2 learners, even with their knowledge of a large number of metadiscourse markers, may still find it challenging to use interactional metadiscourse properly in their writing due to potential differences in the expected, conventional use of interactional resources between L1 writing and target writing contexts (Lee & Casal, 2014). Additionally, the finding that L2 learners tend not to use adequate numbers of hedging expressions in their writing has been regarded as an outcome of their mistaken concern that the inclusion of hedges would weaken the convincing force of their arguments. Their limited use of self-mentions and reader pronouns has also been attributed to their learning experience that informal language should be avoided in English academic writing. From a cultural perspective, Chinese student writers have been found to use fewer self-mentions in constructing their English texts than L1 counterparts, and this has been interpreted as East Asian students’ reluctance to express authorial presence in academic writing due to their collectivistic backgrounds (Bloch & Chi, 1995; Hyland, 2002b).

As illustrated above, the majority of previous research into interactional metadiscourse has focused on how L2 writing differs from L1 writing that serves as benchmark data, and notable differences between the two groups have been recognized as areas for pedagogical attention. While this line of research has advanced our understanding of interactional metadiscourse features in L2 writing, it should be noted that L2 learners are not a uniform group to be contrasted with L1 writers and their use of metadiscourse features may differ by L1 background. One attempt to address this issue was a corpus-based study by Hong and Cao (2014). They examined occurrences of interactional metadiscourse features in English as a foreign language (EFL) learner writing across three different L1s (i.e., Chinese, Polish, and Spanish). Their results showed no significant difference in the number of hedges among the learner groups but the essays by Chinese learners contained more boosters than the other two groups. This finding apparently coincides with earlier findings that Chinese writers would make assertive claims with reduced hedges and more boosters (Hyland & Milton, 1997), but it has often been considered evidence of how East Asian learners of English would use interactional resources, mainly due to the lack of research on interactional metadiscourse involving East Asian learners with different L1 backgrounds. Nevertheless, a few L2 studies endeavored to explore the writing of East Asian learners with different L1 backgrounds and in fact found some distinct patterns of language use among different L1 groups (e.g., Hinds, 1990; Hinkel, 2003, 2009). Based on the ethnic and linguistic composition of the East Asian English learner population (Williams, 2017), these studies tended to target Chinese, Japanese, and Korean speakers.

Interactional metadiscourse at different proficiency levels

Much attention has been given to the question of how L2 writers at different proficiency levels express their stance towards a proposition or the reader, and making arguments with a clear awareness of the reader’s presence has become one of the areas of L2 writing development, as specified by CEFR descriptors for writing (Council of Europe, 2018). If the appropriate and varied use of interactional markers in written discourse is indicative of L2 writing development, we may need to better understand specific patterns of interactional marker use by L2 learners at different proficiency levels and to give insight into how L2 writing instruction can facilitate their development. A few studies have explored this issue of how learners’ use of interactional metadiscourse markers varies by proficiency (e.g., Bax et al., 2019; Cheung & Low, 2019; Intaraprawat & Steffensen, 1995). For example, a recent study by Bax et al. (2019) examined a corpus of 900 essays composed by L2 learners at three different proficiency levels (CEFR B2, C1, C2 levels). Their findings showed that the essays by higher-level learners tended to display a greater variety of interactional markers than those by lower-level learners, while total numbers of markers included in the higher-level essays were fewer than those in the lower-level essays. Interestingly, this pattern of more types and fewer tokens with increased L2 proficiency was found for each of the stance categories targeted in their study (i.e., hedges, boosters, and attitude markers; except for self-mentions that only involve a few types of first person pronouns).

Another method employed in studies on interactional metadiscourse patterns across proficiency levels is to compare metadiscourse features in high quality essays with those in low quality essays (e.g., Cheung & Low, 2019; Ho & Li, 2018; Intaraprawat & Steffensen, 1995; Lee & Deakin, 2016). With the assumption that external measures of proficiency do not always correlate with writing quality, studies using this method have aimed to offer insight into the contributive role of metadiscourse markers for essay quality. Intaraprawat and Steffensen (1995), for example, examined the use of interactional metadiscourse markers (hedges, boosters, and attitude markers) in high-rated and low-rated essays, and they suggested greater occurrences of all metadiscourse categories in the high-rated essays than in low-rated ones. However, using a similar method of comparing high-graded and low-graded argumentative essays, Lee and Deakin (2016) found that the high-graded essays were similar to the low-graded essays in terms of boosters, attitude markers, and engagement markers. One notable difference between the two studies is that the participants of Lee and Deakin (2016) are all Chinese ESL students, while those of Intaraprawat and Steffensen (1995) are ESL students with a wide range of L1 backgrounds (e.g., Arabic, French, Japanese, Korean, and Spanish); it is plausible that their mixed findings might have arisen from the interaction between the two learner variables (i.e., L2 proficiency and L1 background). Thus, to disentangle the effect of L2 proficiency on interactional patterns in writing from that of L1 background, it may be necessary to explore a corpus of essays written by L2 learners from different L1 backgrounds who have been identified to be at different proficiency levels.

Topic effects on interactional metadiscourse

In the field of L2 writing research, there has been much research on the effect of topic on textual features (e.g., Yoon, 2017b; Yang & Kim, 2020; Yang et al., 2015). Specifically, as one of the main writing task variables, topic has been found to influence features of syntactic complexity (e.g., Hinkel, 2009; Yang et al., 2015) and lexical sophistication (e.g., Reid, 1990; Yang & Kim, 2020). Findings of topic effect research have informed us of how to interpret the contribution of syntactic and/or lexical features to writing quality validly in consideration of topic, which is a potential mediating factor. These studies have been motivated by clear evidence that some syntactic and lexical features are good indicators of writing proficiency and development (e.g., Bulté & Housen, 2014; Crossley & McNamara, 2014), and recent studies have begun to explore interactional metadiscourse as one of the areas that would be indicative of proficiency (e.g., Aull & Lancaster, 2014; Bax et al., 2019) and contribute to writing quality (e.g., Yoon, 2017a; Zhao, 2017), but there has been no empirical research that examines topic effects on interactional metadiscourse in L2 writing comprehensively.

Huot’s (1990) review of writing prompts noted that the three feature categories—(a) discourse mode, (b) rhetorical specification (writing purpose and audience specified), and (c) the wording and structure of writing prompts—are likely to influence learners’ writing performance, but previous findings of the relationships between these prompt features and writing performance have been inconclusive. Also, relatively little attention given to the wording of prompts indicates the need to explore potential wording effects, together with those of other macro-level prompt features, on L2 learners’ language use. This study addresses the need for a large-scale quantitative analysis of interactional metadiscourse, with a focus on a task variable (writing topic) and two learner variables (L1 background and L2 proficiency). Given the lack of theoretical explanations of East Asian students’ different use of interactional metadiscourse, this study is of an exploratory nature. The present research is guided by the following research questions:

  1. How does writing topic affect EFL students’ use of interactional metadiscourse in argumentative writing?

  2. How do EFL students with different first language backgrounds use interactional metadiscourse in argumentative writing?

  3. How do EFL students at different second language proficiency levels use interactional metadiscourse in argumentative writing?

  4. How does EFL students’ argumentative writing differ from that of NS writers in interactional metadiscourse use?

Method

Corpus data

For this study, I used the International Corpus Network of Asian Learners of English (ICNALE) (Ishikawa, 2013) that includes timed argumentative essays collected from college-level Asian EFL students from 10 countries (5400 essays) as well as native English speakers (400 essays). The EFL students have been classified into one of the adjusted CEFR levels (i.e., A2: waystage; B1.1: threshold, lower; B1.2: threshold, upper; and B2: vantage or higher), based on their performance on a proficiency test (e.g., TOEIC or TOEFL) or a vocabulary size test. Each student wrote two argumentative essays under time constraints; specifically, having access to the two prompts simultaneously, they had to complete two essays within a given time (up to 80 min for two essays). In the prompts, the students were asked to agree or disagree with the statements below:

  1. It is important for college students to have a part-time job (hereafter, the Part-time topic).

  2. Smoking should be completely banned at all the restaurants in the country (hereafter, the Smoking topic).

For its major quantitative analyses, this study used 1986 essays written by Chinese, Japanese, and Korean EFL students at the three proficiency levels (A2, B1.1, and B1.2). The essays at the B2 level were excluded from analysis because the number of students at this level was far smaller than at the other levels (e.g., only 13 Chinese and 18 Japanese speaking students at B2). The learner essays were mostly written by college freshmen and sophomores (85.8%), and each of the L1 groups did not show a significant gender imbalance (203 female and 184 male Chinese speakers; 132 female and 250 male Japanese speakers; 140 female and 84 male Korean speakers). Identifying unique patterns of interactional metadiscourse features from the essays by these students will offer valuable implications for material development and writing instruction.

Additionally, I discuss to what extent East Asian EFL student writing is different from L1 writing with regard to interactional metadiscourse features by analyzing 400 L1 essays. It should be noted here that the L1-L2 comparison is not the focus of this study, thus limiting the use of L1 essay results to a post hoc analysis that follows detailed analyses of the EFL essays. Table 1 displays the overall size and the number of essays of each sub-corpus.

Table 1 Summary of the corpus data

Instruments

This study employed the Authorial Voice Analyzer (AVA) that calculates normalized frequencies of Hyland’s (2005) interactional metadiscourse categories (i.e., hedge, booster, attitude marker, self-mention, reader pronoun, and directive) (Yoon, 2017a). Of these categories, AVA produces both token (total number of items) and type (number of unique items) values for hedges, boosters, and attitude markers (i.e., categories with a wide variety of items) to identify potentially different roles of the extensive and varied use of stance features. For each category, as presented in Table 2, AVA computes normalized values (i.e., occurrences per 1000 words), which have been widely employed in previous research into interactional metadiscourse (e.g., Aull & Lancaster, 2014; Hong & Cao, 2014; Hyland, 2005; Lee & Deakin, 2016).
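The normalization described above can be illustrated with a minimal Python sketch. The function names and the sample data are hypothetical, and the sketch assumes that type values are scaled per 1000 words in the same way as token values; it is not AVA's actual code.

```python
def per_thousand(count: int, n_words: int) -> float:
    """Return a raw count expressed as occurrences per 1000 words."""
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    return count * 1000 / n_words


def token_and_type_rates(matches: list[str], n_words: int) -> tuple[float, float]:
    """Return (token rate, type rate) for one category in one essay.

    Tokens are all matched items; types are the unique items among them.
    Scaling types per 1000 words is an assumption made for illustration.
    """
    tokens = len(matches)
    types = len({m.lower() for m in matches})
    return per_thousand(tokens, n_words), per_thousand(types, n_words)


# Hypothetical example: four hedge matches (three unique) in a 250-word essay.
hedge_matches = ["maybe", "usually", "maybe", "arguably"]
token_rate, type_rate = token_and_type_rates(hedge_matches, 250)
print(token_rate, type_rate)
```

Normalizing by essay length keeps short and long essays comparable, which matters for timed writing where length varies widely across writers.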

Table 2 Interactional metadiscourse features from AVA

AVA has been designed to cover a wide range of interactional metadiscourse markers and capture linguistic variability in word sequences through regular expressions. One example of booster devices is “without\s\w*\s?doubt*” in which “\s\w*\s?” indicates the possible insertion of an optional word. As a result, this item counts instances of without doubt, without any doubt, without much doubt, without a doubt, etc. An example of hedges is “(i|we)\s\w*\s?\w*\s?assume” that includes “|” (i.e., or) and “\s\w*\s?\w*\s?” (i.e., optional insertion of one word or two words), and this item counts instances of I assume, I would assume, I just simply assumed, we sort of assumed, etc. Another example from the hedge list is “(?<!in|un)(arguable|arguably)” that includes “?<!” (i.e., not following); this item counts occurrences of arguable and arguably as hedges but not inarguable, inarguably, or unarguable.
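As a rough illustration of how such items behave, the three patterns quoted above can be tried with Python's standard re module. The lookbehind is written here as (?<!in|un). The helper name count_matches and the choice to lowercase the text before matching are assumptions of this sketch, not a description of how AVA itself is implemented.

```python
import re

# Patterns quoted in the text.
BOOSTER_DOUBT = re.compile(r"without\s\w*\s?doubt*")
HEDGE_ASSUME = re.compile(r"(i|we)\s\w*\s?\w*\s?assume")
# Negative lookbehind: skip "arguable/arguably" when preceded by "in" or "un".
HEDGE_ARGUABLE = re.compile(r"(?<!in|un)(arguable|arguably)")


def count_matches(pattern: re.Pattern, text: str) -> int:
    """Count non-overlapping matches in lowercased text (an assumed preprocessing step)."""
    return sum(1 for _ in pattern.finditer(text.lower()))


print(count_matches(BOOSTER_DOUBT, "Without any doubt, and without a doubt."))
print(count_matches(HEDGE_ASSUME, "I would assume so; we sort of assumed it."))
print(count_matches(HEDGE_ARGUABLE, "Arguably true, but inarguably odd and unarguable."))
```

Because the patterns carry no word-boundary anchors, a production list would probably need extra safeguards against matches inside longer words; the sketch leaves the patterns exactly as quoted.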

The hedge list of AVA includes a total of 164 expressions, and the booster list includes a total of 174 expressions. The attitude marker list built based on the existing lists of emotion and attitude words (Hu & Liu, 2004; Mohammad & Turney, 2013) contains 640 expressions. The categories of self-mentions and reader pronouns include personal pronouns: first person singular pronouns (I, my, me, and mine) for self-mentions, and first person plural (we, our, us, ours) and second person pronouns (you, your, yours) for reader pronouns. First person plural pronouns are categorized as reader pronouns, given their primary function as inclusive we in single-authored argumentative writing. For directives, AVA counts obligation modals (e.g., should, ought, have to, need to, must) and predicative necessity-related adjectives controlling a to-clause complement (e.g., it is essential to, it is necessary to, it is obligatory to, it is vital to) reported in Hyland (2002a). These predicative adjectives of necessity are all constructed with relevant regular expressions. For example, a directive item, it is essential to, is added to the list as “(it is|its)\s\w*\s?essential\s(for)?\s?\w*\s?\w*\s?\w*\s?to”, making it count varied expressions of this extraposed to clause (e.g., it is essential to, it’s essential for him to, it’s sometimes essential for a college student to, etc.). For directives, AVA also counts the number of imperative constructions that are identified by the Stanford parser (Klein & Manning, 2003).
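The pronoun categories and the directive pattern above can be sketched in Python as follows. The pronoun sets and the regular expression come from the text; the tokenization, the removal of apostrophes so that it’s is read as its, and all function names are assumptions made for this illustration.

```python
import re

# Pronoun lists as given in the text.
SELF_MENTIONS = {"i", "my", "me", "mine"}
READER_PRONOUNS = {"we", "our", "us", "ours", "you", "your", "yours"}

# Directive item for "it is essential to", as quoted in the text.
ESSENTIAL_TO = re.compile(
    r"(it is|its)\s\w*\s?essential\s(for)?\s?\w*\s?\w*\s?\w*\s?to"
)


def count_pronouns(text: str, pronouns: set[str]) -> int:
    """Count alphabetic tokens of the lowercased text that belong to a pronoun set."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(1 for word in words if word in pronouns)


def strip_apostrophes(text: str) -> str:
    """Lowercase and drop straight/curly apostrophes (assumed preprocessing)."""
    return text.lower().replace("'", "").replace("\u2019", "")


def has_essential_directive(text: str) -> bool:
    """Return True if the directive pattern occurs in the preprocessed text."""
    return ESSENTIAL_TO.search(strip_apostrophes(text)) is not None


sample = "When I was young, my friends told me that we should help you."
print(count_pronouns(sample, SELF_MENTIONS), count_pronouns(sample, READER_PRONOUNS))
print(has_essential_directive("It's sometimes essential for a college student to work."))
```

Counting every first person plural pronoun as a reader pronoun mirrors the categorization described above; it would not separate inclusive from exclusive we in co-authored texts.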

This study examines a very large number of essays by EFL students at different proficiency levels. Normalized frequencies of tokens on the basis of 1000 words are used as target measures in this study. To validate these measures, Yoon and Römer (2020) hand coded 20 randomly selected MICUSP texts for hedge and booster features. After resolving all discrepancies through discussion, we checked the correlations of hand coded values with those from AVA. As a result, we found a correlation coefficient of .921 on hedges and .892 on boosters, offering evidence of AVA validly quantifying interactional metadiscourse markers. The full lists of the AVA items are available in the supplemental material of Yoon and Römer (2020).

Analysis

In exploring the main and interaction effects of three predictor variables (i.e., topic, L1 background, and L2 proficiency), I performed three-way ANOVAs with topic as a within-subjects variable and L1 background and L2 proficiency as between-subjects variables. In these analyses, normalized frequencies of metadiscourse markers were included as dependent variables. For self-mentions and reader pronouns, I used logarithmically transformed values to address their positively skewed distributions (Field, 2013). Each of the dependent variables showed descriptive results that satisfy assumptions for three-way ANOVAs (see Table 3 for descriptive statistics). In interpreting statistical significance, the alpha level adjusted by the Bonferroni correction (.0083 = .05/6) was adopted to decrease the possibility of Type I error. The magnitude of effects was estimated using partial eta squared (ηp²), a valid effect size measure for research including within-subject variables. Cohen (1988) suggested ηp² values of .0099, .0588, and .1379 as small, medium, and large effects, respectively. However, given the suggestion that effect sizes should be interpreted differently in different research fields (Ferguson, 2009; Plonsky & Oswald, 2014), the use of effect size values in this study is rather restricted to their relative sizes across different measures.
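The arithmetic behind these decisions can be sketched in Python. Partial eta squared is computed with its standard definition, SS_effect / (SS_effect + SS_error). The log(x + 1) form of the transformation is an assumption (the text does not state which form was used), and the "negligible" label and all function names are hypothetical.

```python
import math


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Divide the family-wise alpha by the number of tests."""
    return alpha / n_tests


def log_transform(values: list[float]) -> list[float]:
    """Apply log(x + 1); this form is assumed so that zero frequencies stay defined."""
    return [math.log1p(v) for v in values]


def partial_eta_squared(ss_effect: float, ss_error: float) -> float:
    """Partial eta squared from the effect and error sums of squares."""
    return ss_effect / (ss_effect + ss_error)


def cohen_label(eta_p2: float) -> str:
    """Label an effect with the Cohen (1988) benchmarks cited in the text."""
    if eta_p2 >= 0.1379:
        return "large"
    if eta_p2 >= 0.0588:
        return "medium"
    if eta_p2 >= 0.0099:
        return "small"
    return "negligible"  # hypothetical label for values below the small benchmark


print(round(bonferroni_alpha(0.05, 6), 4))  # six metadiscourse categories
print(cohen_label(0.263), cohen_label(0.068), cohen_label(0.018))
```

Since the study interprets effect sizes mainly relative to one another, labels of this kind serve only as a rough orientation.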

Table 3 Mean and standard deviation values of interactional metadiscourse by topic, L1 background, and L2 proficiency

Results

Table 4 presents the interaction effects of the three independent variables on interactional metadiscourse features obtained from three-way ANOVAs. As shown in the first column, there was no significant three-way interaction among the effects of topic, L1 background, and L2 proficiency on any of the metadiscourse categories. The result also showed no significant two-way interaction effect between topic and L2 proficiency, nor between L1 background and L2 proficiency. However, there were significant interaction effects between topic and L1 background on various metadiscourse categories (i.e., hedges, attitude markers, self-mentions, and directives), which means that patterns of topic effects on EFL students’ interactional metadiscourse markers are likely to differ across their L1 backgrounds. The interaction effects of topic and L1 background were mostly small in size, with ηp² ranging from .011 to .037.

Table 4 Interaction effects of topic, writers’ L1 background, and L2 proficiency

Post-hoc analyses showed that the Chinese student group used significantly more hedges in the Smoking topic (p < .001, d = − .203), while the Japanese and Korean groups displayed no significant difference in their hedge use across the topics (Japanese: p = .995, d < .001; Korean: p = .050, d = .132). In terms of attitude markers, the Korean group was the only group that showed a significant topic difference (i.e., increased use of attitude markers in the Smoking topic; p < .001, d = − .286). Furthermore, the Japanese and Korean groups increased their use of self-mentions in the Part-time topic (Japanese: p < .001, d = .387; Korean: p < .001, d = .326), but the Chinese students did not change their use of self-mentions across the topics (p = .742, d = − .017). The Chinese and Korean groups showed significantly increased use of directives in the Smoking topic (Chinese: p < .001, d = − .266; Korean: p < .001, d = − .338), while the Japanese group used significantly more directives in the Part-time topic (p = .001, d = .167). These results, which indicate that the topic effect depends on the writer’s L1, are statistically significant but should be interpreted cautiously because of the small magnitude of all interaction effects (ηp² from .011 to .037), and the main effect analysis in fact showed greater differences in the frequency of metadiscourse markers (medium to large effects).

Table 5 presents the main effects of each independent variable on metadiscourse categories. The result showed statistically significant effects of topic and L1 background on interactional metadiscourse. With regard to topic differences, it was found that EFL students used significantly more boosters and directives in the Smoking topic (boosters: p < .001, ηp² = .263; directives: p < .001, ηp² = .018), while they used significantly more self-mentions and reader pronouns in the Part-time topic (self-mentions: p < .001, ηp² = .052; reader pronouns: p < .001, ηp² = .068), with the largest effect size for boosters. These results suggest that topic may play an important role in directing the writer’s attention to varying dimensions of interactional metadiscourse, offering evidence of notable topic effects on interpersonal discourse features, aside from earlier research findings that have suggested topic effects on syntactic and lexical measures. Extracts 1 and 2 show some instances of boosters and directives in the Smoking topic essays.

Table 5 Main effects of topic, writers’ L1 background, and L2 proficiency

Extract 1 (KOR-SMK-192-B12)

Smoking must be completely banned at all the restaurants in the country. It’s not should be banned, it must be completely banned. As you know, there are so many smokers live in our world. And also there are few non-smokers live here. But you see, there is no place for non-smokers.

Extract 2 (CHN-SMK-197-B11)

So it’s necessary to ban smoking completely at all the restaurants in the country. In my opinion, smoking in the public, especially places like restaurant, is a immoral behavior. It’s really a bad experience that there has smoker around you when you were eating.

Extract 3 demonstrates some instances of self-mentions in the Part-time topic essays.

Extract 3 (KOR-PTJ-185-B12)

When I was a high school student 3 grade, I was doing a part time job. Then I met many people and I feel not good. Because guest is not good, work is very hard and it made me very tired. I did not know that time is priceless experience.

The result also showed that the EFL students have different patterns of metadiscourse use across their L1 backgrounds. As displayed in Table 5, L1 background effects were statistically significant for all metadiscourse categories, with the largest effect on self-mentions (ηp² = .117). Post-hoc analyses with the Bonferroni correction showed that the Japanese group tended to pay more attention to interactional metadiscourse than the Chinese and Korean groups, except only for reader pronouns (see Table 6 for summary). Specifically, the Japanese group was found to use more hedges than the other student groups; of the two remaining groups, the Chinese students used more hedges than the Korean students. The Korean student group also exhibited the fewest boosters, while the Japanese and Chinese groups showed comparable usage of boosters. Another notable finding is that, compared to the other groups, the essays by the Chinese students included the fewest self-mentions but the highest number of reader pronouns, which might be a reflection of their stronger ideology that emphasizes the importance of communities and groups over individuals (Bloch & Chi, 1995; Shen, 1989). Taken together, it can be suggested that East Asian EFL students with different L1 backgrounds have remarkably different patterns of interactional metadiscourse use, thus highlighting the need to avoid a monolithic conceptualization of East Asian learners of English (e.g., learners from a collectivist culture) in the context of L2 research.

Table 6 Summary of post hoc comparisons across L1 differences

Unlike the results of topic and L1 background, there was no significant effect of L2 proficiency on interactional metadiscourse. That is, the EFL students at varying proficiency levels showed a similar pattern of interactional features in argumentative writing. This finding can be interpreted in entirely different ways depending on how the ideal use of interactional metadiscourse is set. One of the possible explanations is that East Asian EFL students are likely to develop a good command of interactional metadiscourse at low L2 proficiency, particularly in terms of their extensive use, and accordingly they do not show further changes in the quantity of interactional metadiscourse as their proficiency level advances. An alternative explanation is that East Asian EFL students, even those at an intermediate-high level, have difficulty in noticing important functions of interactional metadiscourse and improving their ability to use interpersonal markers in written discourse. Given these competing interpretations, I further analyzed the essays composed by 200 native English-speaking (NS) college students under similar writing constraints. This dataset, a part of the ICNALE corpus, constitutes 400 essays on the same writing topics (i.e., Part-time and Smoking topics). For this analysis, I conducted a series of independent t-tests to reveal significant differences between the NS group and the EFL student groups at the highest proficiency level (B1.2) for each topic, which would enable me to avoid redundant analyses of L1 background or L2 proficiency effects.

As summarized in Table 7, there were several notable distinctions between the NS group and the EFL student groups. The results showed that the NS group used significantly more hedges than all EFL student groups (Part-time: Chinese, p < .001, d = .96; Japanese, p < .001, d = .53; Korean, p < .001, d = .99. Smoking: Chinese, p < .001, d = .56; Japanese, p = .001, d = .36; Korean, p < .001, d = .86). Additionally, the NS group used significantly more self-mentions than the Chinese group (Part-time: p < .001, d = .58; Smoking: p < .001, d = .67). On the other hand, the NS group used fewer reader pronouns than the EFL student groups (Part-time: Chinese, p < .001, d = −1.42; Japanese, p < .001, d = −.80; Korean, p < .001, d = −.68. Smoking: Chinese, p < .001, d = −1.11; Japanese, p = .001, d = −.33; Korean, p < .001, d = −.52). The NS group also used fewer directives than the Japanese group (Part-time: p < .001, d = −.79; Smoking: p = .001, d = −.65). These results are further explored from a developmental perspective in the Discussion section.
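
To make the direction of the reported effect sizes easier to read, the sketch below computes Cohen's d in Python. It assumes the common pooled-standard-deviation formula and a sign convention of NS mean minus EFL mean, which appears consistent with the values above (positive where the NS group used more, negative where it used fewer), although the paper does not spell out the formula. The function name `cohens_d` and the per-essay counts are hypothetical and invented for illustration.

```python
import math
import statistics

def cohens_d(reference, comparison):
    """Cohen's d with a pooled standard deviation (reference minus comparison)."""
    n_r, n_c = len(reference), len(comparison)
    pooled_var = (
        (n_r - 1) * statistics.variance(reference)
        + (n_c - 1) * statistics.variance(comparison)
    ) / (n_r + n_c - 2)
    mean_diff = statistics.mean(reference) - statistics.mean(comparison)
    return mean_diff / math.sqrt(pooled_var)

# Invented reader-pronoun counts per essay, for illustration only.
ns_reader_pronouns = [1, 2, 0, 3, 2]
efl_reader_pronouns = [4, 5, 3, 6, 5]

# Negative d: the reference (NS) group used fewer reader pronouns on average.
d = cohens_d(ns_reader_pronouns, efl_reader_pronouns)
```

Swapping the two arguments flips the sign but not the magnitude, so the sign only carries meaning once the reference group is fixed.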

Table 7 Summary of significant differences between the EFL student groups and the NS group

Discussion

This study explored the effects of one task variable (i.e., topic) and two learner variables (i.e., L1 background and L2 proficiency) on the quantity of interactional metadiscourse in written argumentation. There were significant interaction effects between topic and L1 background on hedges, attitude markers, self-mentions, and directives, indicating that topic effects on the occurrence of these metadiscourse markers differed by the East Asian learners’ L1. However, as reported above, the interaction effects were generally smaller than the corresponding main effects of topic and L1 background, each of which is discussed in greater depth below. The present results are similar to the findings reported in Hinkel (2009): examining essays written by Chinese, Japanese, and Korean EFL students, Hinkel found that topic tended to affect their use of modal verbs in a consistent manner, but that some topics exerted a much greater impact on a particular L1 group (e.g., Japanese learners’ overuse of possibility and ability modals in essays on the selection of academic majors).

With regard to the research question on topic effects, I found a clear difference in the EFL students’ use of interactional metadiscourse across the two topics, confirming the role of topic as one of the factors that elicit different interpersonal discourse features in writing. Specifically, the Part-time topic allowed L2 writers to express their authorial presence and awareness of imagined readers more explicitly (increased use of self-mentions and reader pronouns), whereas for the Smoking topic L2 writers’ opinions were expressed more strongly through increased use of boosters and directives. These findings may lend support to a widely acknowledged discourse pattern in which Asian students tend to reduce explicit authorial presence expressed by personal pronouns when making strong arguments (Scollon & Scollon, 2001). Additionally, given these diverging directions of topic effects, I tentatively argue for the need to understand the construct of interactional metadiscourse in ways that reflect how interactional markers are actually used; some categories of interactional metadiscourse are closely linked (e.g., boosters and directives contributing to an assertive voice; self-mentions and reader pronouns contributing to explicit writer-reader interaction), but such patterns of co-occurrence do not match the two existing dimensions of Hyland’s (2005) interactional metadiscourse framework, namely, stance and engagement.

The finding of topic differences can be further discussed with regard to prompt wording. As illustrated in the Results section, boosters were the interactional metadiscourse category with the largest topic effect (see Table 5), and a look at the exact wording of the Smoking topic shows that its 12-word statement includes as many as three boosters (i.e., Smoking should be completely banned at all the restaurants in the country). Given this heavy use of boosters in the Smoking prompt, it is plausible that the EFL students were primed to use more boosters when responding to this topic. This issue can be of great importance in the context of testing because the inclusion of stance markers in a prompt can implicitly lead test-takers to compose their essays from a particular point of view, potentially affecting their language and performance scores negatively (Hudson, Lane, & Mercer, 2005; Kobrin, Deng, & Shaw, 2011). Therefore, writing prompts to be used in high-stakes test settings should be constructed with few biased or emotional words, unless the elicitation of particular language features, such as hedging expressions, is intentionally planned.

My argument about the priming effects of prompt wording on interactional metadiscourse remains speculative because this study did not directly compare writing samples from a prompt that includes, for example, multiple boosters with those from the same prompt without boosters. Further research with systematically manipulated prompts will be able to confirm whether the manipulation of prompt wording functions as an effective strategy for L2 writing development. For example, when writing instruction aims to promote students’ production of hedges, students can be given short writing tasks whose prompts do and do not include several hedges. Students who have experienced various writing tasks with and without hedges would become more aware of the roles of hedges and, in turn, learn how to produce them appropriately. This implicit process of learning how to use hedging expressions can be followed by a more explicit post-writing phase that allows students to discuss the various functions of hedges in making arguments.

Additionally, this study demonstrated clear differences in the use of interactional features across the different L1 groups. The essays written by the Korean students were characterized by fewer occurrences of hedges and boosters, while the essays by the Japanese students overall included greater numbers of stance and engagement markers than those by the other groups. Notable features of the Chinese student essays, compared with those by the other groups, were a greater number of reader pronouns and a smaller number of self-mentions. These differences between the groups may be associated with how they build their arguments in persuasive discourse (Williams, 2017). Japanese speakers are likely to use more explicit linguistic markers that entail the functions of potentiality and ambiguity (Maynard, 1993), constructing their stance toward a proposition using an increased number of metadiscourse markers as a means of being polite and opening up alternative ideas (Holmes, 1982). On the other hand, in an effort to avoid ambiguity or uncertainty (Paek, 2005), Korean speakers tend to focus on conveying their ideas and arguments straightforwardly, with reduced attention to epistemic stance or engagement with readers. These findings may point to the importance, in L2 pedagogy and research, of going beyond the broad categorization of East Asian learners of English as those who share a similar cultural and linguistic background. For example, Shen (1989) found that Chinese students experience difficulties with first person pronouns in their English essays. Hong and Cao (2014) suggested that Chinese EFL students tend to use more interpersonal discourse markers in writing than Spanish and Polish EFL students. Given the large proportion of Chinese speakers in the entire English learner population, these findings have often been extrapolated to other East Asian learners of English. However, as found in this study, East Asian EFL students with different L1 backgrounds show notably distinct patterns of interactional metadiscourse features in their writing.

Contrary to the findings related to topic and L1 background, this study showed that the number of interactional metadiscourse features in the learner essays did not differ by L2 proficiency. This finding, together with the results of clear L1 differences, lends support to what Jarvis (2000) suggested in his research on L1 influence. Given the finding of a remarkable similarity in vocabulary use among learners with the same L1, Jarvis argued that “learners from the same L1 background, despite differences in age and L2 exposure, exhibit higher levels of homogeneity than learners from different L1 backgrounds who are matched according to age and L2 exposure” (p. 289). Confirming the important role of L1 background in L2 use, this study may also offer evidence that essays written by same-L1 learners, regardless of their L2 proficiency, are likely to display similar patterns of stance and engagement markers.

Comparing the L2 learner results with those from the NS group, this study demonstrated that L2 learner writing is likely to contain fewer hedges but more reader pronouns than NS writing, potentially identifying which categories of interactional metadiscourse should be treated as pedagogical targets. I acknowledge that it is not ideal to use the essays composed by NS college students as the norm because NS students’ writing proficiency also varies depending on their experience with academic writing (Hall & Navarro, 2011; Römer, 2009). Nevertheless, the NS essays used in this study might have served as benchmark data more effectively than other potential datasets in that they were based on the same topics as those used for the EFL essays and that the L1 English-speaking college students had gained more English academic writing experience than the EFL students. Furthermore, previous studies have indicated that the realistic aim of many L2 writers is to develop the ability to write essays similar to those of their NS counterparts (e.g., Zhou, Busch, & Cumming, 2014), which supports the methodological decision to use the NS student essays as the norm.

This study, like earlier research (e.g., Hyland & Milton, 1997; Lee & Deakin, 2016), revealed that L2 learners tend to underuse hedging markers in writing compared with their NS counterparts, potentially leading to more straightforward, stronger assertions. In this study, the reduced use of hedges was identified as a common feature of all L2 learner groups, pointing to the need to include hedging as a target trait in L2 writing instruction. Additionally, the L2 groups’ greater use of reader pronouns relative to the NS group can be interpreted as an outcome of cultural backgrounds that value harmony with other members of a community and regard the obvious projection of the self as rude or disrespectful (Bloch & Chi, 1995; Shen, 1989). However, it should be noted that, while writers’ use of reader pronouns plays an important role in establishing common ground with readers (Hyland, 2001), overreliance on personal pronouns can still undermine the formality of academic writing. When observing overuse of reader pronouns in L2 learner writing, writing instructors may need to help their students notice its potential drawbacks by, for example, giving them the opportunity to compare their essays containing an excessive number of reader pronouns with model essays (e.g., those reformulated by the instructor) that contain an appropriate number of reader pronouns.

This study offers insight into how to design lessons on interactional metadiscourse with a clear understanding of the varying stance and engagement styles of different L1 groups. Based on the current findings, for example, a course for Korean students can focus on encouraging them to build more convincing and balanced arguments with more hedges and boosters; a course for Chinese students can aim to have students develop their ideas using more self-mentions so that they express their individual voices in writing more clearly. Finally, given the finding of no significant difference between learners at different proficiency levels, it can be suggested that (1) we should attend to the diversity or appropriacy of interactional metadiscourse markers rather than their quantity to capture development, or (2) interactional metadiscourse is a multidimensional construct whose development is not commensurate with the general improvement of L2 proficiency, possibly due to its optionality in context (i.e., an area with no binary outcome of being correct or incorrect). Nevertheless, as evidenced by earlier studies (e.g., Abbuhl, 2006; Cheng & Steffensen, 1996), the ability to use interactional metadiscourse markers can be promoted through focused instruction, indicating the need to explore the effectiveness of explicit and interventionist instruction for L2 learners’ extensive, diverse, and appropriate use of interactional metadiscourse features.

Conclusion

This study showed that L2 learners’ use of interactional metadiscourse differed across topics and L1 backgrounds. Based on the topic differences, the study argued that, when designing and/or selecting writing prompts, teachers and testing experts should take into account the potential effects of prompt orientation and prompt wording on interactional metadiscourse. The findings related to learner variables provided valuable insight into what the development of interactional metadiscourse actually means and how to foster it in instructional contexts. Nevertheless, several limitations of this study are worth noting for the careful interpretation of its findings. First, this study approximated proficiency improvement through a cross-sectional analysis of the corpus data. A follow-up longitudinal study on the development of interactional metadiscourse awareness and use in writing may be needed to trace actual developmental trajectories over time. Second, this study examined the development, or significant changes, of interactional metadiscourse using a quantitative approach that focused primarily on the number of interactional metadiscourse markers in each text. This quantity-based analysis may fail to provide a full picture of learners’ interactional metadiscourse competence, so it would be of great importance to identify what types of interactional markers (within each metadiscourse category) are prevalent in learner writing and how the use of such frequently used markers varies with L2 proficiency. Third, while the discussion of topic effects on interactional metadiscourse provides insight into teaching and assessment, it should be interpreted with caution because this study, as a large-scale corpus study, was based on only two topics and involved no systematic manipulation of topic differences. Finally, essays composed under time constraints may not fully reflect learners’ true writing ability because timed writing rarely involves revising or editing. Future research with many carefully modified prompts and extended time will be able to address these limitations and yield more generalizable findings. Interactional metadiscourse has become an important area in L2 writing and EAP research, as the writer’s proper interaction with the audience has been identified as one of the keys to successful academic writing. Future research addressing the points discussed above will enable us to achieve a greater understanding of how to construct an effective authorial voice in a variety of academic settings.