A multiword expression, or a formulaic sequence, is a combination of two or more words that co-occur adjacently either as a free or a more restrictive combination, such as idioms, lexical bundles, phrasal verbs, and collocations (Cowie, 1981; Nesselhauf, 2005; Wray, 2002). Multiword expressions are essential in understanding and producing English, as they comprise approximately 50% of spoken and written English (Erman & Warren, 2000). Although multiword expressions are indispensable for achieving proficiency in English, not all L2 learners are able to achieve competency with multiword expressions. Corpus studies have reported that college-level ESL students were less likely to use multiword expressions in writing than their English L1 counterparts (e.g., Laufer & Waldman, 2011; Siyanova & Schmitt, 2007). Studies have also demonstrated that college-level ESL students knew only a limited number of multiword expressions (e.g., Macis & Schmitt, 2017; Nguyen & Webb, 2017). Vu and Peters (2022a) have discussed common factors responsible for the difficulty in learning and using multiword expressions, such as L1-L2 congruency (e.g., do an effort in Dutch vs. make an effort in English) and semantic opaqueness (e.g., once in a blue moon). In order to expand our understanding of how L2 learners identify the meanings of multiword expressions, this study focused on noun-noun compounds, a type of multiword expression that is under-investigated in L2 research. The specific issue explored was how L2 learners interpret semantically ambiguous noun-noun compounds.

According to Selkirk (1982), the following are the types of compounds common in English: noun-noun (e.g., school bus), verb-noun (e.g., play date), noun-verb (e.g., window shop), adjective-verb (e.g., dry clean), and verb-particle (e.g., pick up). In principle, the head of a compound is the element on the right side, and the head determines the part of speech of the compound, with the exception of verb-particle compounds. The verb-particle compounds are also referred to as phrasal verbs and have been investigated extensively in multiword expression research (e.g., Gardner & Davies, 2007; Siyanova & Schmitt, 2007; White, 2012; Zareva, 2016). Among the types of compounds, noun-noun compounds pose a unique challenge to L2 learners due to their semantic ambiguity.

Noun-noun compounds are the most common and most productive type (Bauer, 1987; Semenza & Luzzatti, 2014); that is, an unlimited number of new compounds can be created by language users and be potentially added to the English lexicon. In fact, Tanaka and Baldwin (2003) reported that static English dictionaries provided only 27% coverage of noun-noun compounds that occurred ten times or more in the British National Corpus. Although dictionary definitions are usually unavailable for newly created compounds, some of them have socially agreed definitions. For example, a less established compound, chocolate book, can refer to either “a book on the topic of chocolate” or “a book made from chocolate (book-shaped chocolate),” but English L1 users would be more likely to choose the former meaning if the compound was presented without a specific context.

That is, depending on how the two nouns are combined in meaning interpretation, the same compound can be interpreted differently, with one of the possible meanings preferred over the others. As pointed out by vocabulary researchers (e.g., Bauer, 2017; Nagy, 1997), being able to distinguish between preferred and non-preferred meanings is a challenging task for L2 learners, especially those who have limited L2 proficiency and cultural exposure. This study investigated whether L2 learners would be able to interpret ambiguous noun-noun compounds in a nativelike manner, comparing the performances of college-level ESL students with three different proficiency levels. The next section summarizes relevant findings and theories from two areas of research: the processing of multiword expressions, and the processing and interpretation of noun-noun compounds.

Literature Review

Processing L2 Multiword Formulaic Expressions

Although formulaic expressions may be processed word-by-word (Siyanova-Chanturia, 2015), some formulaic expressions can be stored as a whole rather than as individual words in language users’ long-term memory and are also retrieved as a whole from memory (Wray, 2002). In other words, formulaic expressions provide chunks for language users to rely on when they comprehend and produce language. Ability to use and understand formulaic expressions is an important aspect of language competency because it contributes to nativelike fluency (Boers et al., 2006; Pawley & Syder, 1983; Tavakoli et al., 2019; Yan, 2020). Being able to process a majority of the language in memorized chunks enables language users to reduce processing time and cognitive load that would otherwise be required for conducting a word-by-word analysis. Some formulaic expressions also help language users communicate more effectively with others who share the same sociocultural background (Pawley & Syder, 1983).

The usage-based approach in L2 learning explains that formulaic expressions are learned by repeated exposure to the expressions (Ellis, 2001, 2002, 2003; Tomasello, 2000, 2003). As L2 learners receive repeated exposure, they are able to establish associations between the form and meaning of an expression in their long-term memory. As a result, they become able to retrieve the expression from their memory as a chunk, rather than conducting a word-by-word analysis. In the usage-based approach, frequency of input is a critical factor because the more frequently learners encounter formulaic expressions, the stronger the associations they establish in their long-term memory. Empirical findings suggest that L2 learners are sensitive to the frequency of occurrence of formulaic expressions (Ellis et al., 2008).

Comparing L1 and L2 multiword processing, Wray (2002) argued that L2 learners process multiword expressions word-by-word, in a manner fundamentally different from L1 users. However, more recent findings seem to agree that L2 learners are able to process multiword expressions in a way similar to L1 users, but various factors also influence their processing (e.g., Wolter & Yamashita, 2018). One such factor is the congruency between L1 and L2 multiword expressions (e.g., Carrol et al., 2016; Wolter & Gyllstad, 2013; Yamashita & Jiang, 2010). For instance, Wolter and Gyllstad (2013) found that Swedish L1 college-level ESL students were faster and more accurate in judging the acceptability of an adjective-noun collocational phrase when it was a congruent collocation, which had a one-to-one equivalence in their L1 (e.g., commercial break), than an incongruent collocation (e.g., real estate).

L2 proficiency and exposure also influence multiword processing. For instance, Yamashita and Jiang (2010) reported that ESL college students who had higher proficiency and more experience living in an English-speaking environment were faster and more accurate in a collocation acceptability judgment task compared to EFL college students who had lower proficiency and no experience living in an English-speaking environment. Nevertheless, the ESL students still struggled with L1 incongruent collocations. Moreover, based on an extensive review, Conklin and Schmitt (2012) also concluded that lower-proficiency English L2 college students tended to process formulaic expressions in a word-by-word manner, while more advanced learners processed formulaic expressions by chunking, similar to the way English L1 users did. These proficiency-based differences can be explained by the usage-based approach of L2 learning. As learners achieve higher proficiency, they also accumulate more L2 input, which presumably helps them notice, retain, and retrieve multiword expressions in a more nativelike manner. Furthermore, recent findings have uniformly suggested that L2 learners who had more vocabulary knowledge were better able to learn multiword expressions incidentally through reading (Vilkaitė, 2017; Vu & Peters, 2022b, 2023).

Processing and Interpretation of Noun-Noun Compounds

Given that noun-noun compounds are multiword expressions, the research specific to noun-noun compounds is also concerned with whether they are processed word-by-word or as a whole. The interactive model of morphological processing (e.g., Taft, 1994) maintains that compounds are stored in the mental lexicon either as individual words or as a whole. When language users retrieve the meanings of compounds from the mental lexicon, they retrieve them either word-by-word or as a whole, depending on which route is more efficient in terms of processing load (Ji et al., 2011; Pollatsek et al., 2000). Earlier research in word recognition (e.g., Sandra, 1990) argued that word-by-word decomposition would occur only for semantically transparent compounds (e.g., hometown) because word-by-word analysis would lead to an incorrect meaning for semantically opaque compounds (e.g., honeymoon). However, more recent findings suggest that individual words’ meanings can be activated even for semantically opaque compounds, both in L1 (e.g., Libben et al., 2003) and L2 (e.g., Li et al., 2017).

Although noun-noun compounds may be processed either word-by-word or as a whole, research related to the semantic interpretation of less established noun-noun compounds seems to assume word-by-word processing as a default. Due to the high productivity of noun-noun compounds, it is highly likely that language users will encounter less established or novel noun-noun compounds whose definitions are not known to them. In order to identify the meanings of unfamiliar compounds, language users have to conduct a word-by-word analysis and infer the combined meanings of the modifier and head (Connolly et al., 2007). For combining the meanings of the two nouns in noun-noun compounds, linguists contend that the semantic relation between the modifier (the first noun) and the head (the second noun) provides crucial information (Downing, 1977; Gleitman & Gleitman, 1970; Levi, 1978; Warren, 1978). Table 1 presents the classification by Levi (1978), which has served as a framework in more recent compound-recognition research (e.g., Estes & Jones, 2006; Gagné, 2001, 2002) as well as in corpus and NLP research (e.g., Barker & Szpakowicz, 1998; Ponkiya et al., 2018).

Table 1 Relation Classifications (Levi, 1978)

The classification is based on an analysis of the words deleted in the compounding process. Table 1 summarizes the word categories and their corresponding semantic relations (a total of 12 relations) between modifier and head, along with some examples. For instance, steam iron is categorized as the USE relation, referring to “iron that uses steam.” The word categories CAUSE, HAVE, and MAKE allow either the modifier or the head to be the subject of the recovered clause, while the rest of them allow only the head to be the subject of the clause. For instance, tear gas and drug deaths are both categorized as the CAUSE relation, referring to “gas that causes tears” and “deaths that a drug causes,” respectively.

Although the classification proposed by Levi (1978) provided a foundation for research in noun-noun compounds, it should be noted that the classification is not an exhaustive list of semantic relations. For instance, the classification by Downing (1977) included semantic relations such as HALF-HALF (giraffe-cow) and OCCUPATION (coffee man). Studies have also pointed out that some compounds require unconventional semantic relations (Culicover et al., 2017; Jackendoff, 2010), such as dog person (“person who likes dogs”) and bat boy (“boy who picks up bats” or “boy who looks like a bat”), which cannot be explained by the classification. Interpretation of novel compounds requires language users to understand the semantic details of the constituent nouns as well as the discourse and extralinguistic contexts in which the compounds are being used. Although L1 users have implicit knowledge about how noun-noun compounds should be interpreted by filling in the most socially appropriate semantic relation, L2 learners may not always possess native-like knowledge in identifying the appropriate semantic relation.

Semantic Ambiguity of Noun-Noun Compounds

Because noun-noun compounds are the most productive in English, there are many less established compounds whose definitions are not available in a dictionary. English L2 learners need to infer the meanings of less established compounds by identifying the semantic relation between the modifier and head. Although the context in which compounds are embedded can certainly provide a cue to the meanings, there is research suggesting that some context is misleading or unhelpful (Beck et al., 1983) and that compound interpretation is initially dependent only on the constituent morphemes (Cohen & Staub, 2014). As vocabulary researchers (e.g., Bauer, 2017; Nagy, 1997) suggest, less established compounds can pose difficulty to L2 learners because many of the compounds require some implicit L2 linguistic and cultural knowledge. For instance, book shelf and toy shelf both require the FOR relation, with the modifier indicating the materials to be stored. However, the same relation cannot be applied to kitchen shelf.

Although the selection of semantic relation for less established compounds may appear to be arbitrary, the Relational Interpretation Competitive Evaluation (RICE) theory (Spalding & Gagné, 2011), an updated version of the theory proposed by Gagné and Shoben (1997), provides a usage-based account of the psycholinguistic processing involved in the selection of semantic relation. When language users interpret the meanings of compounds, they consider all possible semantic relations, but select the relation that is more frequently used with a given modifier. For example, the modifier, mountain, is more frequently interpreted as the IN relation, such as in mountain house and mountain clouds. Therefore, language users would first choose this particular relation based on their experience. Once the relation candidate is selected, the plausibility of the interpretation is evaluated based on whether the head has the correct properties for the relation selected because of the modifier.
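
The two steps described above (ranking candidate relations by their frequency with the modifier, then checking plausibility against the head) can be illustrated with a minimal sketch. This is a deliberate simplification rather than the RICE model itself: the relation counts, the head-compatibility sets, and the function names are all hypothetical and were invented for illustration only.

```python
from collections import Counter

# Hypothetical counts of relations observed with the modifier "mountain".
# These numbers are invented for illustration; they are not from any corpus.
MODIFIER_RELATIONS = {
    "mountain": Counter({"IN": 8, "ABOUT": 2, "MAKE": 1}),
}

# Hypothetical plausibility information: relations each head can support.
HEAD_COMPATIBLE = {
    "house": {"IN", "FOR", "MAKE"},
    "magazine": {"ABOUT", "FOR"},
}


def rank_relations(modifier):
    """Order candidate relations by how often they occur with the modifier."""
    return [rel for rel, _ in MODIFIER_RELATIONS[modifier].most_common()]


def interpret(modifier, head):
    """Return the most frequent relation that the head can plausibly support."""
    for relation in rank_relations(modifier):
        if relation in HEAD_COMPATIBLE.get(head, set()):
            return relation
    return None  # no candidate relation survived the plausibility check


print(interpret("mountain", "house"))     # the most frequent relation fits
print(interpret("mountain", "magazine"))  # falls back to a less frequent one
```

In this sketch, a compound whose head is compatible with the modifier's most frequent relation is resolved at the first candidate, whereas a compound such as the hypothetical mountain magazine requires moving down the ranking, which loosely mirrors the competition among relations that the theory describes.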

There are a number of empirical findings that lend support to the RICE theory (e.g., Estes & Jones, 2006; Gagné, 2001, 2002; Gagné & Spalding, 2009; Gagné et al., 2005). For instance, in Gagné et al. (2005), English L1 college students were shown ambiguous novel compounds, along with their possible meanings, each reflecting a different semantic relation. In an online task, the students indicated the most preferred meaning for each novel compound. The results demonstrated that some compounds had a dominant meaning (e.g., for woman judge, “a judge that is a woman” was preferred 96% of the time) compared to the alternative meaning (“a judge for a woman”). In contrast, other compounds, such as wool basket, were found to be more ambiguous, with the students selecting at about 50% each for the two possible meanings, “a basket for wool” and “a basket made of wool.” Interestingly, the reaction time data indicated that they were slower in selecting the meanings for the more ambiguous compounds with lower preference percentages, presumably due to the increased competition between two possible semantic relations. Moreover, based on an analysis of corpora from the English Lexicon Project (Balota et al., 2007) and the British Lexicon Project (Keuleers et al., 2012), Schmidtke et al. (2016) demonstrated that the diversity and frequency of possible semantic relations influenced the selection of semantic relation. These findings suggest that it is easier for language users to select a semantic relation for a compound that has fewer semantic relation possibilities, with one relation predominantly used with a given modifier.

The Study

The literature reviewed above offers important premises regarding how L2 learners interpret the meanings of less established noun-noun compounds. According to the RICE theory (Gagné & Shoben, 1997; Spalding & Gagné, 2011), learners’ sensitivity to the usage of compounds, including sensitivity to the diversity and frequency of semantic relations, contributes to the understanding of the meanings of noun-noun compounds. In order to interpret the meanings, learners need to identify the semantic relation between the modifier and the head based on their experience with other compounds that have the same modifier. This process can be challenging to L2 learners because their L2 experience, including both cultural and linguistic experience, is often limited and different from English L1 users. At present, the only study available on this topic is a descriptive study by Zhou and Murphy (2011), in which college-level Chinese L1 EFL students inferred the meanings of compounds and novel compounds. Some of the students’ errors included incorrect semantic relations (e.g., “burger made of cheese” for cheeseburger), which the researchers attributed to a lack of exposure to English-speaking culture.

Research in multiword formulaic expressions also underscores the importance of L2 exposure in developing nativelike competency. Learners with more advanced proficiency are better able to notice, process, and retain multiword expressions that are encountered more frequently (e.g., Durrant & Schmitt, 2010). Compounds, especially those written with a space between individual words, are indeed multiword expressions; therefore, it is reasonable to assume that L2 proficiency and amount of compound exposure also influence how L2 learners process and learn compounds. The aspect of compounds addressed in this study is the ability to identify the preferred semantic relation of ambiguous noun-noun compounds. Learners with an advanced proficiency level typically have more exposure to L2, including compounds. It is hypothesized that advanced-level learners would be able to identify the preferred semantic relation, which in turn would help them interpret the ambiguous compounds in a nativelike manner. Thus, this study investigated the following set of research questions:

To what extent are L2 learners able to identify the preferred meanings of ambiguous noun-noun compounds? Are there any differences according to their proficiency level?

Method

Participants

Three groups of English L2 college students participated in the study: an intermediate ESL group (n = 20), an advanced ESL group (n = 20), and a post ESL group (n = 19). The intermediate and advanced ESL groups were recruited from students enrolled in Level 4 and Level 6 classes at an intensive English institute in a mid-sized university in the US. The students received an invitation to the research project during their class time, and the participants were randomly selected from those who were interested in the study. A total of 6 levels of classes were offered at the institute, and each student was assigned to their appropriate level according to a placement test score. The students at the institute enrolled in ESL courses full-time, and after Level 6, they would be considered “graduated” from the institute and allowed to enroll as freshmen in an undergraduate program at the university. The post ESL group was the highest proficiency group and consisted of international students (3 undergraduate and 16 graduate students) enrolled in an academic program at the university. The label of “post” reflected the fact that they were English L2 students who were beyond the intensive ESL program because their English proficiency scores were sufficient for academic program admission. They were recruited through an online solicitation message, and those who wished to participate contacted the research team. For the undergraduate participants, freshmen were excluded because their English proficiency level might resemble the advanced ESL group, which was only one level lower than freshmen. In terms of alignment with a standardized proficiency scale, the intermediate ESL group (intensive English Level 4) was equivalent to Common European Framework of Reference for Languages (CEFR) B1, the advanced ESL group (intensive English Level 6) was equivalent to CEFR B2, and the post ESL group (undergraduate/graduate program) was equivalent to CEFR C1 or C2.

The participants filled out a questionnaire on basic demographic information, including age, gender, native language, and major (see Table 2). In addition to the three ESL groups, a group of native speakers of English, hereafter, the English L1 group (n = 30), participated in a norming task that was necessary for developing the task for the ESL participants. They were monolingual English-speaking students enrolled in undergraduate or graduate programs at the same university. The purpose of the norming task was to find out the meanings that native speakers of English prefer in a compound inference task, which will be explained in detail in the next section. Prior to the task, the participants were asked about their language background, and those who were fluent in languages other than English were excluded from participation in order to minimize any influence from other languages in choosing the meanings.

Table 2 Participant Characteristics

Tasks and Materials

A compound inference task was constructed for this study. In the task, the English L2 participants were shown semantically ambiguous novel compounds and asked to identify the meanings preferred by L1 users. First, in order to select the novel compound candidates, a norming task, which included 30 novel compounds and their two possible meanings, was created based on the items used in previous studies (Gagné & Shoben, 1997; Gagné et al., 2005). The norming task was administered to the English L1 group individually. For each novel compound, the participants were asked to choose the meaning they preferred without consulting any references. If there was a different meaning they preferred, they were asked to write it down. It took approximately 20 min for them to complete the task.

The English L1 group’s preference percentage was calculated for each compound, and the compounds that had 75% or higher preference were selected for the task for the ESL groups. After this screening, a total of 16 compounds were selected (see Appendix 1), with a mean preference of 88.50%. Each compound was accompanied by four answer options (see Appendix 2 for a sample), including the two possible meanings (dominant and non-dominant meanings) and two distractors, which did not incorporate the modifier’s meaning, following the format used in Zhang (2013). For instance, the novel compound, child art, was shown along with the following options: Art that is made by a child (dominant meaning), Art that is created for children (non-dominant meaning), Art that is hung on the wall (distractor), Art that is expensive (distractor). The order of the answer options was randomized in the task.
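
The screening step described above can be sketched as follows. The sketch assumes that each L1 rater's choice is recorded as a label for one of the two meanings; the response counts shown are hypothetical and are not the study's norming data, and the function names are illustrative.

```python
def preference_percentage(choices, meaning):
    """Percentage of L1 raters who chose the given meaning."""
    return 100.0 * sum(c == meaning for c in choices) / len(choices)


def select_items(norming, threshold=75.0):
    """Keep compounds whose most-chosen meaning reaches the threshold."""
    selected = {}
    for compound, choices in norming.items():
        top = max(set(choices), key=choices.count)  # most-chosen meaning
        pct = preference_percentage(choices, top)
        if pct >= threshold:
            selected[compound] = (top, pct)
    return selected


# Hypothetical responses from 30 raters ("A" and "B" label the two meanings).
norming = {
    "child art": ["A"] * 27 + ["B"] * 3,     # clear preference for A
    "wool basket": ["A"] * 16 + ["B"] * 14,  # nearly even split
}

selected = select_items(norming)
print(selected)
```

With these invented counts, only the first compound passes the 75% criterion; the nearly evenly split compound is screened out, as a highly ambiguous item would be.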

In order to ensure that the ESL groups were familiar with the words used in the task, including individual words within the compounds, the words were checked using the ESL institute’s vocabulary list, a comprehensive list of words that students at each level had to master before moving to the next level. All of the words were included in the lists prior to Level 4 (the lowest proficiency group), which ensured that the participants should be familiar with the words. In addition, to make sure that the compounds were in fact “novel,” it was verified that none of them were included in the American Heritage Dictionary, 5th edition (2012). Ten of the compounds appeared in the Corpus of Contemporary American English (COCA), but the mean frequency of the compounds was only 5.36, with the lowest frequency being 1 and the highest frequency being 12. Therefore, it was determined that the compounds were less established and appropriate for serving as novel compounds in this study. Finally, two ESL instructors at the institute checked the grammar and words used in the task and confirmed they should be familiar to the participants, including the lowest proficiency group.

Procedures

The tasks were administered to the three ESL groups in a paper-and-pencil format in a quiet room. The participants first received the informed consent information and filled out the questionnaire. They then completed the compound inference task, which everyone completed within 30 min. The intermediate and advanced ESL groups completed the task in their classrooms. The post ESL group was recruited from international students at the university at large, rather than from a class in which all of them were enrolled. Therefore, the data was collected either individually or in a small group because it was not possible to set up a day to collect data from all of the participants at the same time. The participants were asked not to disclose the research activity they had participated in to anybody else. In all groups, the participants completed the task independently without relying on any other source of information, such as a dictionary.

Results

The compound inference task used a multiple-choice format. Scoring was therefore done by counting each participant’s responses, with each item scored all-or-nothing for each answer option. Then, for each participant, the mean scores for the dominant and non-dominant answer choices were calculated, and these were averaged into the group means.
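
The scoring procedure described above can be sketched in a few lines. The sketch assumes that scores are expressed as percentages of items, and the answer key, option labels, and responses are hypothetical; only the compound child art is taken from the task description.

```python
def score_participant(responses, key):
    """Return (dominant %, non-dominant %) for one participant.

    responses: dict mapping compound -> chosen option label
    key: dict mapping compound -> {"dominant": label, "non_dominant": label}
    """
    n_items = len(key)
    dominant = sum(responses.get(c) == k["dominant"] for c, k in key.items())
    non_dominant = sum(
        responses.get(c) == k["non_dominant"] for c, k in key.items()
    )
    return 100.0 * dominant / n_items, 100.0 * non_dominant / n_items


def group_mean(scores):
    """Average a list of participant-level percentages."""
    return sum(scores) / len(scores)


# Hypothetical four-item key; options "c" and "d" stand for the distractors.
key = {
    "child art": {"dominant": "a", "non_dominant": "b"},
    "item 2": {"dominant": "b", "non_dominant": "a"},
    "item 3": {"dominant": "a", "non_dominant": "d"},
    "item 4": {"dominant": "c", "non_dominant": "b"},
}
responses = {"child art": "a", "item 2": "b", "item 3": "d", "item 4": "a"}

dominant_pct, non_dominant_pct = score_participant(responses, key)
print(dominant_pct, non_dominant_pct)
```

Because distractor choices count toward neither score, the two percentages for a participant need not sum to 100.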

Table 3 displays a descriptive summary of the task, including the mean percentages, standard deviations, and 95% confidence intervals for the dominant and non-dominant answer options. The means for the intermediate ESL group were 60.00 for the dominant option and 35.63 for the non-dominant option. The means for the advanced ESL group were 68.75 for the dominant option and 27.81 for the non-dominant option. The means for the post ESL group were 75.00 for the dominant option and 23.03 for the non-dominant option. A one-way Analysis of Variance (ANOVA) was performed on the means for the dominant answer option, which was the expected answer. The results indicated that the group differences were significant, F(2, 56) = 10.839, p < 0.001, ηp² = 0.279. Multiple comparisons using the Bonferroni test indicated that the advanced and the post ESL groups were significantly more accurate than the intermediate group, p = 0.025 and p < 0.001, respectively. The difference between the post ESL and advanced groups was non-significant.

Table 3 Descriptive Summary of the Compound Inference Task

Next, in order to further examine the groups’ performance, the means from a subset of less ambiguous compounds and a subset of highly ambiguous compounds were calculated. Based on the preference percentages from the English L1 group, the low ambiguity type included the compounds that had the six highest preference percentages (96.67 – 100%; M = 97.22%), and the high ambiguity type included the compounds that had the five lowest preference percentages (76.67 – 80%; M = 78%), leaving the five compounds in the middle range excluded for this analysis. Figure 1 summarizes the comparison of the mean accuracy percentages (the percentages for the dominant option) for each group. The means for the intermediate ESL group were 71.67 (SD = 23.63, 95% CI: 64.22–79.12) for the low ambiguity type and 51.00 (SD = 25.53, 95% CI: 40.78–61.22) for the high ambiguity type. The means for the advanced ESL group were 84.17 (SD = 13.76, 95% CI: 76.72–91.62) for the low ambiguity type and 54.00 (SD = 16.03, 95% CI: 43.78–64.22) for the high ambiguity type. The means for the post ESL group were 92.98 (SD = 8.45, 95% CI: 85.34–100.63) for the low ambiguity type and 57.89 (SD = 25.73, 95% CI: 47.40–68.38) for the high ambiguity type.

Fig. 1 Comparison of Mean Percentages for the Low and High Ambiguity Types

A two-way repeated measures ANOVA was performed on the group means, with ambiguity type as the within-subject factor and group as the between-subject factor. The results indicated that the main effect for the ambiguity type was significant, F(1, 56) = 60.778, MSE = 24183.81, p < 0.001, ηp² = 0.520, demonstrating that the low ambiguity type was more accurately answered than the high ambiguity type. The main effect for the group was also significant, F(2, 56) = 4.875, MSE = 1948.77, p = 0.011, ηp² = 0.148, demonstrating that overall accuracy differed between the groups. However, the interaction between ambiguity type and group was non-significant. Finally, a two-tailed paired-samples t-test was conducted for each group to compare the difference between the low and high ambiguity types. The results demonstrated that all of the groups scored significantly higher in the low ambiguity type: the intermediate ESL group, t(19) = 2.725, p = 0.013, the advanced ESL group, t(19) = 6.464, p < 0.001, and the post ESL group, t(18) = 5.402, p < 0.001.
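
As a reading aid, the partial eta squared values reported in this section can be recovered from the F ratios and their degrees of freedom through the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The following sketch applies that identity to the three reported F tests; the function name and the dictionary labels are illustrative and are not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, reported partial eta squared), as given in the text.
reported = {
    "one-way ANOVA, group": (10.839, 2, 56, 0.279),
    "two-way ANOVA, ambiguity type": (60.778, 1, 56, 0.520),
    "two-way ANOVA, group": (4.875, 2, 56, 0.148),
}

for label, (f_value, df1, df2, eta) in reported.items():
    recovered = partial_eta_squared(f_value, df1, df2)
    print(f"{label}: recovered {recovered:.3f}, reported {eta:.3f}")
```

Under this identity, each recovered value agrees with the reported effect size to three decimal places.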

Discussion

The compound inference task asked the English L2 participants to choose the dominant meanings of ambiguous noun-noun compounds, which were the meanings preferred by the English L1 group. According to the usage-based theory of compound interpretation, the interpretation of ambiguous compounds depends on the semantic relation between the modifier and head, selected by language users based on the frequency and diversity of possible relations (e.g., Gagné & Shoben, 1997; Spalding & Gagné, 2011). In the compound inference task, the L2 participants needed to evaluate the dominance between the two possible semantic relations, relying on their experience using English compounds. Results from the one-way ANOVA demonstrated that the advanced ESL group and the post ESL group were more accurate in identifying the dominant meanings than the intermediate ESL group. Advanced learners possess higher proficiency in English, including vocabulary knowledge, as well as more exposure to compounds. These findings clearly indicate that L2 proficiency is an important factor in interpreting ambiguous noun-noun compounds. Nevertheless, the lack of difference between the advanced and post ESL groups seems to imply that highly proficient learners still have difficulty with nativelike interpretation.

The comparison of the mean accuracies between the low and high ambiguity types offered further findings regarding sensitivity to the degree of ambiguity. The low ambiguity type consisted of compounds for which one meaning was more clearly dominant than it was for the compounds of the high ambiguity type. All of the L2 groups, even the intermediate ESL group (lowest proficiency), were significantly more accurate in the low ambiguity type. These findings appear to suggest that L2 learners are sensitive to the semantic relation preferred in English, yet the extent of sensitivity differs according to their proficiency level; that is, the higher the proficiency level, the better the learners are able to notice the dominant semantic relation, presumably due to increased exposure to English compounds. However, the findings should be interpreted with caution because the participants’ L1 was not controlled in this study. The compounds used in the study were novel in English, but there might have been some L1 semantic congruency effects that interfered with the participants’ performance in identifying the dominant meaning (e.g., Carrol et al., 2016; Wolter & Gyllstad, 2013; Yamashita & Jiang, 2010). Further research is warranted to verify the current findings.

Although this study was conducted using an offline task on the issue of semantics, the findings add new understanding to L2 multiword expression research. Regarding multiword processing, studies suggest that as learners achieve higher proficiency in L2, they become more sensitive to nativelike usage and process formulaic expressions by chunking in a manner similar to the way L1 users process formulaic expressions (e.g., Conklin & Schmitt, 2012). This study suggests that semantic understanding of multiword expressions is also an important competency that leads to nativelike usage of English. Although semantic ambiguity of multiword expressions has not been extensively researched, there are some studies that call for more attention. For instance, in a corpus study, Gardner and Davies (2007) found that phrasal verbs were highly polysemous, with the top 100 highest frequency phrasal verbs each having approximately 5.6 meanings. Zareva (2016) also reported that English L2 college students used a majority of polysemous phrasal verbs with only one meaning in their oral presentations. The achievement of English competency necessitates full control over the meanings of multiword expressions.

Conclusion

This study investigated how L2 learners understand the meanings of multiword expressions, focusing on ambiguous noun-noun compounds. Overall, the findings suggest that L2 learners are sensitive to the meanings preferred by English L1 users, yet they need to achieve higher proficiency in order to approach nativelike interpretation. Highlighting the importance of semantics in multiword expressions, Gardner and Davies (2007) call for a more specific approach in teaching multiword expressions, such as introducing the multiple meanings of high frequency phrasal verbs through various contexts. Compounds are not commonly investigated in multiword expression research, presumably because most compounds are not formulaic, with the exception of phrasal verbs (verb-particle compounds). However, compounds offer important insights into the processing and learning of formulaic expressions, such as collocations and idioms. It is hoped that the findings from this study provide insights into the semantic aspect of L2 multiword expressions.

Finally, limitations and future research suggestions are addressed. One of the major limitations of the compound inference task was the selection of novel compounds and their meaning options. They were selected from the items used in previous L1 compound-processing studies (Gagné & Shoben, 1997; Gagné et al., 2005), which resulted in offering meaning options with limited possibilities. Although it was necessary to control the number of meaning options for the study, in a future study, it would be beneficial to incorporate a more open-ended task that asks for participants’ interpretations of novel compounds without pre-selected options, as well as an increased number of novel compounds. In this study, the number of novel compounds needed to be smaller than in previous compound-processing studies with English L1 participants in order to make the task feasible for the participants, who were English L2 learners. Nevertheless, the number of items in the task was comparable to previous L2 multiword expression studies (e.g., 15 items in Vilkaitė, 2017; 12 targeted items in Yan, 2020).

Another limitation that needs to be acknowledged is the design of the compound inference task. The novel compounds were presented without context, in order to focus solely on the meanings of the modifier and head when the participants chose the most appropriate semantic relation. Contextual information plays an important role in the interpretation of compounds, especially compounds that are novel or less established (Jackendoff, 2010). In a future study, it is important to design a task in which compounds are presented in contexts (e.g., sentences, paragraphs) that replicate realistic encounters with compounds.

In addition, in order to verify the findings from this study, future research needs to consider learners’ individual differences, such as L1 and age, when assessing their ability to identify dominant meanings of noun-noun compounds, as L1 transfer is evident in the processing and learning of L2 multiword expressions (e.g., Wolter & Gyllstad, 2013). It would be necessary to develop a method that controls participants’ L1s, the target compound items, and the mean age across participant groups. A corpus study focusing on the semantic relations of noun-noun compounds would also be beneficial, particularly for pedagogical purposes. An in-depth analysis of high-frequency compounds, including a list of all possible semantic relations and their contexts, could serve as a useful tool for L2 learners to obtain targeted exposure to English compounds and their meanings.