Introduction

It is widely acknowledged among researchers that the acquisition of vocabulary is the foundation of English learning (Nation and Waring 1997; Oxford and Scarcella 1994). English as a Second Language (ESL) learners and linguists have attempted to develop various learning theories and frameworks (Gu 2003; Hu and Nassaji 2016; Hulstijn and Laufer 2001; Keating 2008; Prince 1996; Schmitt 2008) to identify effective learning methods to promote word retention. With the rapid development of information technology in recent years, e-learning systems have been widely adopted in language learning for ESL students (Golonka et al. 2014; Wang et al. 2017). As revealed by several previous studies (Chen and Chung 2008; Chen and Li 2010; Zou and Xie 2018), conventional word learning systems are limited in that they do not adequately capture or understand the prior vocabulary knowledge of learners. There are two underlying reasons for this.

First, the user data obtained by learning systems are limited. There are two types of techniques, intrusive and non-intrusive techniques (Kaya and Bicen 2016; Ortigosa et al. 2014; Ramakers et al. 2012), for collecting user data for learning systems. Intrusive techniques refer to explicit data collection involving users, like user input, surveys, or feedback or the attachment of explicit data collection devices like sensors (Bedogni et al. 2012), EEG headsets (Campbell et al. 2010), or eye-trackers (Alemdag and Cagiltay 2018) to users, while non-intrusive techniques refer to implicit data collection like recording learning logs (Friesner and Hart 2005), recording click-through data (Joachims 2002), and exploiting devices without direct contact with the users [e.g., using digital cameras to collect videos of users for emotional analysis (Poria et al. 2015)]. According to the review study conducted by Fu and Hwang (2018), the devices used in technology-enhanced learning for data collection are mainly traditional portable computers. Furthermore, non-intrusive methods (Ortigosa et al. 2014) are harder to apply in technology-enhanced learning studies. Because of the limited kinds of devices and methods employed, the sources of data collected from users in e-learning systems are not adequately diverse and fruitful for in-depth analysis and the deep understanding of learners.

Second, it is challenging to interpret the user data obtained even if the data are diverse and fruitful because of the limited theories of user data interpretation for language learning. For example, it is still difficult to establish accurate connections between the patterns in the data from Magnetic Resonance Imaging (MRI) of the human brain and language learning processes, although some basic and rough patterns have been identified (Rahmani et al. 2017; Barbeau et al. 2017). Similarly, only some shallow relationships between eye gazing data and learning processes have been identified (Koć-Januchta et al. 2017). To sum up, it is very difficult to link the low-level biological data (e.g., EEG, eye gazing, MRI, etc.) of a learner to high-level semantics (e.g., learning status, affective status, etc.) even with deep neural networks (Khosrowabadi et al. 2014). In other words, building an effective model to interpret and represent user data in learning systems is largely constrained by this gap.

Without a good interpretation and understanding of learners’ prior vocabulary knowledge, word learning systems are unable to cater to individual learning needs. In other words, it is difficult to facilitate personalized word learning with word learning systems when the issues of limited interpretation and understanding are not addressed. Formally, personalized word learning (Chen and Chung 2008; Chen and Li 2010; Zou and Xie 2018) refers to employing personalized learning strategies in vocabulary learning processes. Specifically, personalized learning is defined as “instruction in which the pace of learning and the instructional approach are optimized for the needs of each learner. Learning objectives, instructional approaches, and instructional content (and its sequencing) may all vary based on learner needs. In addition, learning activities are meaningful and relevant to learners, driven by their interests, and often self-initiated” in the United States National Education Technology Plan 2017 (US Department of Education 2017).

Previous studies (Hsu et al. 2013; Lin et al. 2013; Jeong et al. 2012; Xie et al. 2016) mainly address the above research problem by adopting the learner profile, a data-driven model for learner representation and interpretation, to provide personalized learning. The learner profile can represent user data from various sources and help e-learning systems to understand factors like learning styles, learning status, and prior knowledge levels. In this study, we focus on learner profiles for personalized word learning. As mentioned, there are two categories of techniques, intrusive and non-intrusive, for user data collection. In previous studies, the methods of constructing learner profiles were correspondingly classified as explicit and implicit methods (Zou et al. 2017a; Wang et al. 2018). Specifically, we conduct a further and extensive study on the following two research questions.

  • What are the system architectures of the personalized word learning systems based on learner profiles for ESL university students?

  • Which is the optimal method to integrate implicit and explicit learner profiles for personalized word learning systems?

The remainder of this article is organized as follows. Section 2 reviews the relevant research on personalized word learning systems. Section 3 specifies each component of the proposed word learning system. Section 4 introduces the methodologies employed in the experiment. Section 5 reports the empirical results of this study, analyzes them in depth, and discusses their pedagogical implications. Section 6 draws the conclusion.

Related work

Along with the irresistible tide of e-learning, recent decades have witnessed a similar flood of development of educational technologies—personalization (Chen and Chung 2008; Martins et al. 2008; Brusilovsky and Millán 2007), a concept that is increasingly expected to change the landscape of learning and teaching and that has attracted a great deal of worldwide scholarly attention, ranging from the investigation of nature and factors of personalization to the establishment of the conceptual framework of personalized learning (Tseng et al. 2008; Brusilovsky and Henze 2007).

According to Wang et al. (2004), personalization is a pedagogical response to the inherent diversity of learners’ knowledge background, skill levels, and preferences. It is believed that ideal personalization is to maximize the compatibility between the learning method and learners’ “particular educational needs and personal characteristics,” and to therefore enable the largest enhancement of their “satisfaction, learning speed and learning effectiveness” (Gómez et al. 2014), in terms of which word learning, of course, is among the beneficiary skills (Chen and Chung 2008). Now such an ideal is being realized thanks to the development of modern technology, especially that of mobile devices and wireless web (Chen et al. 2005). The data tracking systems and wide accessibility of mobile devices (Mobasher 2007) allow m-learning to involve diverse modes and methods, among which personalized learning is one of the most essential (Subramanya 2014), thereby earning itself a significant place as a particular feature of m-learning (Romrell et al. 2014; Cochrane 2010). As an e-learning factor, it is implemented mainly in the design level of interactive learning environments, where it is proposed that such elements be taken into consideration as “locus of control, learning styles, anxiety, tolerance for ambiguity, prior experience, interests, attitudes, and disabilities” (Reeves and Reeves 1997). For all the extensive discussions of personalization factors, they mainly center on two keywords. One is autonomy. Personalization puts learners’ choice in the center (Baker and Clarke-Midura 2013; Bray and McClaskey 2015), ranging from ensuring a learning pace and learning styles catering to their preferences to providing learning content compatible with their needs and interests. 
In word learning, it is suggested that learners be allowed to choose the words to learn and create vocabulary lists on their own—only then would word knowledge be of greater salience and of longer retention (Swaffar 1988). The other is scaffolding. This is based on the acknowledgment of the crucial position of a learner’s ability and feedback in the learning process and effect (Chen et al. 2005), as well as on the reasonableness of the zone of proximal development (ZPD) (Vygotsky 1978). ZPD theory holds that the learner would be “frustrated” or “presented with no challenge” if the instruction is too difficult or too simple; the ideal level of learning materials should fit into the “zone” between an individual’s highest and lowest ability limits—which is exactly the goal of scaffolding (Hammond and Gibbons 2005). In word learning, personalization is expected to support adaptation and befit scaffolding by addressing the exact problem of a given individual and offering different levels of support according to their different abilities. Learners’ creation of a word list could be counted as an example of two factors: By being allowed to select the target words in personalized word lists according to their interests and abilities, learners are expected to engage in deeper processing and longer retention of the learned knowledge due to their achievement of autonomy, as well as having their problems exactly met and knowledge properly digested because of the scaffolding.

On a larger scale, a variety of conceptual systems for personalized word learning have been proposed from different perspectives. Some focus on personalization at the level of educational strategy, represented by theories of curriculum sequencing (Brusilovsky 2003; Chen et al. 2006; Hübscher 2000) and adaptive presentation (Papanikolaou et al. 1999; Wang et al. 2004); others regard Internet techniques as the basic requisite of personalized learning and investigate the realization of personalization by e-commerce (Wu et al. 2003), web searches (Sugiyama et al. 2004), web data mining (Lin et al. 2013), social media (Xie et al. 2014), etc. Chen and Chung (2008) built a system based on Item Response Theory (IRT) and learning memory cycles, in which learners could achieve their highest learning efficiency by having the learning material cater to their vocabulary abilities and memory cycles. Chen and Li (2010) advance personalized context-aware ubiquitous learning systems in an attempt to adapt the learning content to learners’ locations, schedules, and abilities. Similarly, Huang et al. (2012) propose a ubiquitous English vocabulary learning system using video clips to allow learners to experience systematic word learning without time or space restrictions. Loucky (2012) suggests the pre-arrangement of the target vocabulary into bilingual categories with common semantic keywords in order to build a distance vocabulary learning system. Bulger (2016) builds a typology of technologically-enabled personalized learning systems along with five supporting categories: a customized learning interface, learning management, data-driven learning, adaptive learning, and an intelligent tutor. In terms of facilitating personalized word learning, Xie et al. (2016) discuss two kinds of profiling techniques, explicit user profiling and implicit user profiling, mainly focusing on the ownership of learners’ data and control of their vocabulary proficiency.

System architecture

Fig. 1 The architecture of the personalized word learning system

With the development of connectivist pedagogy in recent years, connectivist approaches have been defined and applied to teaching and learning practices (Downes 2010; Siemens 2005). Learning is defined as “the process of building networks of information, contacts, and resources that are applied to real problems,” and “this pedagogical approach focuses on building and maintaining networked connections that are relevant, current and flexible enough to support student-centered learning” (McLoughlin 2013). Unlike teacher-directed curricula, student-centered learning is supported by personalized learning environments that enable “individuals to select, integrate and construct knowledge using various software, services, and options based on their needs and circumstance” (McLoughlin 2013). Ideally, such a model can lead to learning based on individual needs. The word learning process is also driven by individual needs (Chen and Li 2010). Therefore, in this section, we propose a detailed system architecture to facilitate personalized word learning.

As shown in Fig. 1, the generic system architecture of the proposed personalized word learning system can be divided into three components: user data collection, learner profiling, and personalized learning. The details of each component are introduced in the following subsections.

User data collection

Fig. 2 The nested model for vocabulary (Zou et al. 2017a)

The data collected in this component feed the learner profile, which adopts the conventional vector form to represent the target words and their corresponding knowledge levels. Formally, a learner profile is denoted as

$$\begin{aligned} L_{i}=(w_{1}:\varepsilon ^{i}_{1};w_{2}:\varepsilon ^{i}_{2};...;w_{n}:\varepsilon ^{i}_{n}), \end{aligned}$$
(1)

where \(w_{x}\) is a target word, \(\varepsilon ^{i}_{x}\) is the knowledge level of learner i on word \(w_{x}\), and a value in the interval [0, 1] is used to represent \(\varepsilon ^{i}_{x}\) (Zou et al. 2017a). The value of \(\varepsilon ^{i}_{x}\) is calculated as a linear combination of the values obtained from explicit and implicit data acquisition, as formalized in Eq. (4). The proposed system integrates explicit and implicit data acquisition as proposed in an earlier study (Wang et al. 2018). As mentioned, explicit data acquisition relies on user input data to understand learners’ prior knowledge levels. In the context of word learning, learners are asked to indicate their prior word knowledge levels according to vocabulary knowledge scales (VKS) (Folse 2006) for selected words at different difficulty levels. Normally, a 3-rating VKS is adopted, as it provides a good balance between accuracy and efficiency (Zou et al. 2017b). These words are organized in the form of word-nested models, as shown in Fig. 2. The nested model for vocabulary is essential to grouping words at different difficulty levels, and a word set \(A_{n}\) (\(1\le n \le k\)) is the set including all words at the difficulty level n. The difficulty levels of words can be obtained by using current software tools like Twinword or Frequent Level Checking (FLC). Note that learners can only input their prior knowledge levels for a limited number of words. For the remaining words, we use an explicit acquisition function (Zou et al. 2017a) to estimate the knowledge levels.

For implicit data acquisition in word learning systems, the typical data sources to be collected are historical learning logs and current learning data. We compared various kinds of historical learning data sources, including reading texts, writing assignments, and test papers, in a previous study (Zou et al. 2015). In that study, we found that test papers are the most accurate data source for constructing learner profiles. In addition, the integration of all three data sources ensures a more accurate construction of learner profiles than exploiting a single data source. Although there are several potential approaches (Maseleno et al. 2018; Dietz-Uhler and Hurn 2013) to exploiting other data sources in learner profiling, we still adopt the hybrid method to integrate the three data sources above for implicit data acquisition in the proposed system, as the focus of this study is to identify the optimal method for integrating implicit and explicit data acquisition. The main idea of implicit data acquisition is to adopt term frequency-inverse document frequency (TF-IDF) (Jones 1972) to denote the weights of the words in the external documents. For a learning document \(d= \{w_{1}, w_{2},..., w_{d}\}\) and a set D of learning documents, the TF-IDF paradigm is adopted to measure the term weighting as follows:

$$\begin{aligned} rel(w_{j})=\frac{f(w_{j},d)}{\max \{f(w,d):w\in d\}}\times \log \frac{|D|}{|\{d\in D:w_{j}\in d\}|}, \end{aligned}$$
(2)

where the first component \(\frac{f(w_{j},d)}{\max \{f(w,d):w\in d\}}\) is the term frequency (TF) and the second component \(\log \frac{|D|}{|\{d\in D:w_{j}\in d\}|}\) is the IDF part, which represents the salience of a given word in this document (Wang et al. 2018; Zou et al. 2017a).
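As a rough illustration of Eq. (2), the term weighting could be computed along the following lines. This is a minimal sketch rather than the authors' implementation: the function name `rel`, the representation of documents as lists of tokens, the use of the natural logarithm, and the zero weight for words absent from the document or the corpus are all assumptions made here.

```python
import math
from collections import Counter


def rel(word, doc, corpus):
    """Weight of `word` in `doc` following Eq. (2).

    `doc` is a list of tokens and `corpus` is a list of such documents
    (both representations are assumptions of this sketch).
    """
    counts = Counter(doc)
    if counts[word] == 0:
        return 0.0  # assumed: a word absent from the document has no weight
    # TF part: raw frequency normalized by the most frequent word in the document.
    tf = counts[word] / max(counts.values())
    # IDF part: log of corpus size over the number of documents containing the word.
    df = sum(1 for d in corpus if word in d)
    if df == 0:
        return 0.0  # assumed guard for a document that is not part of the corpus
    return tf * math.log(len(corpus) / df)


# Hypothetical toy corpus, for illustration only.
corpus = [["abate", "abate", "storm"], ["storm", "calm"], ["calm", "sea"]]
weight = rel("abate", corpus[0], corpus)
```

In this toy corpus, "abate" is the most frequent word of the first document and occurs in no other document, so it receives a higher weight than "storm", which is less frequent in the document and more common across the corpus.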

Learner profiling

Learner profiling adopts the conventional vector form to represent target words and their corresponding knowledge levels. Formally, a learner profile is denoted as follows:

$$\begin{aligned} L_{i}=(w_{1}:\varepsilon ^{i}_{1};w_{2}:\varepsilon ^{i}_{2};...;w_{n}:\varepsilon ^{i}_{n}), \end{aligned}$$
(3)

where \(w_{x}\) is a target word, \(\varepsilon ^{i}_{x}\) is the knowledge level of learner i on word \(w_{x}\), and a value in the interval [0, 1] is used to represent \(\varepsilon ^{i}_{x}\) (Zou et al. 2017a). The value of \(\varepsilon ^{i}_{x}\) is calculated on the basis of the linear combination of the values obtained from explicit and implicit data acquisition as follows:

$$\begin{aligned} \varepsilon ^{i}_{x} = \alpha \cdot \varepsilon ^{i}_{x,ex} + (1-\alpha ) \cdot \varepsilon ^{i}_{x,im}, \end{aligned}$$
(4)

where \(\varepsilon ^{i}_{x,ex}\) is the knowledge level of learner i obtained from explicit data acquisition, \(\varepsilon ^{i}_{x,im}\) is the knowledge level obtained from implicit data acquisition, and \(\alpha\) is a parameter in [0, 1] that adjusts the weights of these two values. In a previous study (Wang et al. 2018), a weight of 0.5 was used so that the explicit and implicit knowledge levels of the words were equally weighted. In this study, further values of \(\alpha\) are tested to identify the optimal combination.
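The linear combination in Eq. (4) can be sketched in a few lines. The function name `combine` and the dictionary representation of a profile (word mapped to knowledge level) are assumptions of this sketch, and the sample words and levels are hypothetical.

```python
def combine(explicit, implicit, alpha):
    """Blend explicit and implicit knowledge levels per word, as in Eq. (4).

    `explicit` and `implicit` map each target word to a level in [0, 1];
    both are assumed to cover the same set of words.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return {
        word: alpha * explicit[word] + (1 - alpha) * implicit[word]
        for word in explicit
    }


# Hypothetical knowledge levels, for illustration only.
explicit = {"abate": 0.5, "candid": 1.0}
implicit = {"abate": 0.0, "candid": 0.5}
profile = combine(explicit, implicit, alpha=0.75)
```

Setting `alpha` to 1.0 reproduces the explicit profile and setting it to 0.0 reproduces the implicit one, which corresponds to the two single-source extremes of the weighting.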

In addition to the integration of explicit and implicit knowledge levels of words, two kinds of updating methods, time-decayed update and feedback-driven update, are employed in learner profiling (Wang et al. 2018). As the retention of a word will decrease as time elapses, time-decayed update applies the idea of the Ebbinghaus forgetting curve (Wixted and Ebbesen 1997) and exploits a time-decayed function \(\varepsilon ^{i}_{x}|t=e^{-t/\varepsilon ^{i}_{x}}\) (where \(\varepsilon ^{i}_{x}|t\) is the knowledge level without the review of word \(w_{x}\) after time t) (Wang et al. 2018). Meanwhile, feedback-driven update is a mechanism to adjust \(\varepsilon ^{i}_{x}\) by considering learning achievements during the word learning processes of the proposed system. The main idea is to categorize the feedback results in four different cases and use a piecewise function to deal with all the cases, as introduced in Wang et al. (2018).
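The time-decayed update can be illustrated with the exponential form given above. This sketch follows the function as printed; the name `decayed_level`, the unit of `t` (which the text does not specify), and the handling of a zero knowledge level are assumptions. The feedback-driven piecewise function is not reproduced here because its four cases are defined in Wang et al. (2018) rather than in this article.

```python
import math


def decayed_level(level, t):
    """Knowledge level after time `t` without review, using exp(-t / level).

    `level` is the current knowledge level in [0, 1]; the unit of `t` is
    left open in the text and is therefore an assumption of the caller.
    """
    if t < 0:
        raise ValueError("t must be non-negative")
    if level <= 0.0:
        return 0.0  # assumed: an unknown word stays unknown (avoids division by zero)
    return math.exp(-t / level)


# A better-known word decays more slowly than a poorly known one.
slow = decayed_level(0.9, 2.0)
fast = decayed_level(0.3, 2.0)
```

Read literally, the function evaluates to 1 at t = 0 for any positive knowledge level, so the stored level acts as a decay constant rather than as the starting value of the curve.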

Personalized learning

The personalized learning component aims to offer a sequence of learning tasks according to the knowledge level of each learner. As the knowledge levels are reflected in the learner profile, we recommend learning tasks according to the learner profiles obtained in the above subsection. As the focus is to investigate how to optimize the combination of explicit and implicit knowledge levels in the learner profile, we adopt a recommendation algorithm based on word coverage (Xie et al. 2016), which posits that a learning task should contain more target words unfamiliar to the learners. For a task t, the degree of unfamiliarity of this task can be defined as follows:

$$\begin{aligned} \theta (t,i) = \sum _{\forall w_{x}\in t}\varepsilon _{x}^{i}, \end{aligned}$$
(5)

where \(\theta (t,i)\) is the degree of unfamiliarity of learner i with the learning task t, and \(w_{x}\) is one of the target words in the learning task t. The recommended tasks are to maximize the degree of unfamiliarity as follows:

$$\begin{aligned} t^{*} = \arg \max _{t\in T}{\theta (t,i)}, \end{aligned}$$
(6)

where T is the set of learning tasks available to the word learning systems.
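Eqs. (5) and (6) amount to scoring every available task and picking the highest-scoring one. The sketch below follows Eq. (5) exactly as printed, summing the profile values of the target words in a task. The names `theta` and `recommend`, the dictionary layout of tasks and profiles, the zero default for words missing from the profile, and the extension from a single best task to the top k tasks are assumptions of this sketch.

```python
def theta(task_words, profile):
    """Eq. (5) as printed: sum of the profile values of the task's target words."""
    # Assumed: a word missing from the profile contributes 0.
    return sum(profile.get(word, 0.0) for word in task_words)


def recommend(tasks, profile, k=1):
    """Eq. (6), generalized to return the k tasks with the largest theta.

    `tasks` maps a task name to its list of target words.
    """
    ranked = sorted(tasks, key=lambda name: theta(tasks[name], profile), reverse=True)
    return ranked[:k]


# Hypothetical profile and tasks, for illustration only.
profile = {"abate": 0.5, "candid": 0.25, "elude": 1.0}
tasks = {"t1": ["abate", "candid"], "t2": ["elude", "abate"], "t3": ["candid"]}
best_two = recommend(tasks, profile, k=2)
```

With `k=2` the function returns two candidate tasks, which matches the number of tasks the system suggests to the learner at each round. If the profile values are instead read as familiarity, one might score each word as one minus its value; that variant is not taken from the source.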

Fig. 3 The overall learning processes in word learning systems

As shown in Fig. 3, the whole learning process can be divided into seven steps as follows:

  1. The learner first inputs the prior knowledge levels for selected words provided by the system.

  2. By incorporating external data sources, the learner profile is established in the system using both explicit and implicit data.

  3. Two learning tasks are suggested by the system using word coverage recommendations, as mentioned in this subsection.

  4. The learner picks and completes one learning task from the two suggested tasks.

  5. After the learner completes the learning task, the system examines whether the whole learning process is completed.

  6. If the learning processes are not completed, the system will provide feedback to the learner profile, and then go back to step 2.

  7. If the learning processes are completed, the system will update the learner profile and terminate the learning processes.
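The seven steps above can be read as a simple loop, sketched below under several assumptions. The function `run_session`, the callbacks `choose` (the learner's pick) and `apply_feedback` (a stand-in for the feedback-driven update, whose actual form is given in Wang et al. 2018), the removal of completed tasks from the pool, and the stopping rule based on a fixed number of tasks are all hypothetical and not taken from the system itself.

```python
def run_session(profile, tasks, choose, apply_feedback, n_tasks=10, k=2):
    """Loop over steps 3 to 7 for a profile already built in steps 1 and 2.

    `profile` maps words to knowledge levels; `tasks` maps task names to
    lists of target words. Returns the final profile and the completed tasks.
    """
    remaining = dict(tasks)
    completed = []
    while len(completed) < n_tasks and remaining:
        # Step 3: rank tasks by the summed profile values of their target words.
        ranked = sorted(
            remaining,
            key=lambda name: sum(profile.get(w, 0.0) for w in remaining[name]),
            reverse=True,
        )
        suggested = ranked[:k]
        # Step 4: the learner picks one of the suggested tasks and completes it.
        picked = choose(suggested)
        completed.append(picked)
        words = remaining.pop(picked)  # assumed: a completed task is not offered again
        # Steps 5 to 7: feed the outcome back into the profile, then re-check the loop.
        profile = apply_feedback(profile, words)
    return profile, completed


# Hypothetical callbacks and data, for illustration only.
def first_choice(suggested):
    return suggested[0]


def reset_words(profile, words):
    return {**profile, **{w: 0.0 for w in words}}


final_profile, done = run_session(
    {"a": 0.5, "b": 0.25, "c": 1.0},
    {"t1": ["a"], "t2": ["b"], "t3": ["c"]},
    first_choice,
    reset_words,
    n_tasks=2,
)
```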

Methodologies

In addition to 32 ESL university students who participated in the experiment of the previous study (Wang et al. 2018), 68 more university students were invited to participate in the further experimental study. There were thus a total of 100 ESL university students with English proficiency at the level of IELTS Band 5.0. We randomly sorted the students into five equal groups. Note that we had already conducted an experiment on two groups with 16 participants in each group in the previous study (Wang et al. 2018). The details of each group are introduced and summarized in Table 1.

Table 1 The five groups in the previous and current experiments
  • Control Group The control group employed only explicit learner profiles. In other words, their knowledge levels of vocabulary were obtained from their explicit specifications in the system, and the explicit data about their prior knowledge levels were used as the final learner profile. That is, weights of 1.00 and 0.00 for explicit and implicit learner profiles were used during learner profile integration, as introduced in Eq. (4). Four more participants joined the control group, which thus included a total of 20 participants.

  • Experimental Group 1 Experimental Group 1 received different settings from the control group. Specifically, the weight values were slightly adjusted to 0.75 and 0.25, respectively, for explicit and implicit data. That is, Experimental Group 1 was more heavily weighted for explicit data (i.e., user-input prior knowledge levels) when constructing the learner profile. As this group did not participate in the previous experiment, 20 participants newly joined this group in this study.

  • Experimental Group 2 Experimental Group 2 was the experimental group in the previous study (Wang et al. 2018). In this group, equal weights, 0.50 and 0.50, were adopted for the explicit and implicit data for learner profiling, respectively (i.e., \(\alpha = 1-\alpha = 0.50\)). To maintain the equality of participant numbers of each group, four more participants were included in Experimental Group 2, so that 20 participants were included in this group.

  • Experimental Group 3 The settings for Experimental Group 3 were the converse of those of Experimental Group 1. In other words, the weights were set at 0.25 and 0.75. Thus, more weight was given to implicit data (i.e., the prior knowledge levels learned from learner assignments, exam papers, and so on) when constructing the learner profile. As this group did not participate in the previous experiment, 20 participants newly joined this group in this study.

  • Experimental Group 4 The settings of Experimental Group 4 were the converse of those for the control group. In other words, the weights were set at 0.00 and 1.00 for explicit and implicit data, respectively. The learner profile was thus built only on the basis of implicit data, with explicit data neglected. As this group did not participate in the previous experiment, 20 participants newly joined this group in this study.

Turning to the experimental procedures, a pre-test was conducted to ensure that the participants had little or no prior knowledge of the 20 target words before the learning processes. The learners created an account on the word learning system and followed the seven steps of the learning process introduced in Sect. 3.3. The target words, learning tasks, and marking criteria followed those of previous studies of vocabulary acquisition (Folse 2006; Zou 2017). The whole learning process lasted for two days, and each participant had to complete 10 learning tasks suggested by the system. Each learning task could be finished in about ten minutes. After the learning process was completed, a post-test was conducted to examine learners’ immediate learning of the 20 target words within 30 min.

For both pre-test and post-test, we used the same test to evaluate learning effectiveness; a sample test paper is provided in Table 2 of the “Appendix”. The 3-rating vocabulary knowledge scale (Folse 2006) was adopted for the marking criteria. The target words were adapted from Zou’s research (2017). Specifically, (1) if the learner could not remember the word meaning, no score (0) would be given; (2) if the learner could remember the word meaning without knowing how to use it in context, a half score (0.5) would be given; and (3) if the learner could remember the word meaning and use the word in the correct context, a full score (1) would be given.

Results and discussion

Fig. 4 The experimental results for the five groups

The experimental results are illustrated in Fig. 4. The green curve and orange bars present the same values (i.e., post-test results) in two different ways, while the blue bars show the pre-test results. The pre-test results of the five groups are close to each other. We applied a significance test to verify that the pre-test differences between any two groups were not significant (\(p > 0.1\)). We also applied Student’s t-test to the post-test results and found that the differences between any two groups were significant (\(p < 0.05\)). Furthermore, we identified Experimental Group 1 as having the best performance on word retention, while Experimental Group 4 had the worst performance. Given that Experimental Group 3 integrated explicit and implicit data in learner profiling and the control group only employed the explicit data, the finding that the control group outperformed Experimental Group 3 indicates that integrating both data sources does not always outperform a single explicit data source when establishing a learner profile.

Furthermore, we found that the curve reached its peak value at \(\alpha = 0.75\) and decreased with decreasing \(\alpha\), taking its minimum value when \(\alpha = 0.00\). These results show that the integration of implicit data with explicit data can improve the effectiveness of personalized word learning. However, such integration should be dominated by explicit data. In other words, the optimal method of integrating implicit and explicit data is to give more weight to explicit data (i.e., \(\alpha > 1-\alpha\)), while implicit data serve as a supplementary source during the integration. This result is consistent with the findings of a previous study (Xie et al. 2014) that the explicit data specified by users are more dominant, and that a better quality of data can be generated if implicit data are added as supplements.

The implication of the results is that learners actually understand their own vocabulary proficiency better than their test papers, assignments, and similar materials would reveal. From the perspective of the system, the design of personalized word learning systems needs explicit data on users’ prior knowledge levels as obtained through user input. However, the requisite manual efforts are time-consuming and infeasible for large amounts of data. Implicit data then serve as an important source of additional data compensating for this drawback. The designer of a personalized word learning system should therefore pay close attention to the balance between user-input and implicit data. From the perspective of word learning, the university students in the experiments showed that they clearly understand their prior knowledge levels of the vocabulary. In addition to using personalized word learning systems, university ESL students are advised to have their own “personalized learning plans,” including picking English readings with a larger vocabulary size than their own and rehearsing unfamiliar target words in a learning task.

Conclusion

In this article, we studied the system architecture of personalized word learning systems based on learner profiles and the optimal method for integrating implicit and explicit data sources to construct learner profiles. We introduced each component of the proposed word learning system and conducted experimental studies on different combinations of explicit and implicit data sources for learner profiling. The experimental results showed that the explicit data dominates, while implicit data sources can serve as supplements. In addition, we discussed the implications of this study from the perspectives of system design and word learning.

The limitations of this study are that the number of participants in each group was not large and the behavioral data during learning were not actually applied to adjust the learning process. In the future, we will continue investigating the research questions of how to minimize the effort of user input for the explicit data and of how to integrate the behavioral data to better facilitate personalized word learning.