1 Introduction

With the development of sophisticated e-learning environments, which are characterized by vast amounts of information, strong interactivity, broad coverage and freedom from space-time restrictions (Anane et al. 2004; Mallinson and Sewry 2004), personalization is becoming an important feature of e-learning systems. The learners, who are the main users of such systems, are numerous and differ in background, goals, capabilities and personalities. Personalized learning occurs when e-learning systems are designed to provide educational experiences that fit the needs, goals, talents, and interests of their learners. Personalization can be achieved using pre-defined rules that sequentially propose learning objects in a specified learning path (Koper and Olivier 2004). It can also be achieved by using heuristic rules, user models and recommendation techniques (Resnick and Varian 1997).

Recommender Systems (RSs) are software tools and techniques that suggest items likely to be of use to a user (Kantor et al. 2011). The suggestions relate to various decision-making processes such as what items to buy, what music to listen to, what online news to read or what learning objects to study. “Item” is the general term used to denote what the system recommends to users. A RS normally focuses on a specific type of item (e.g. CDs, news, or learning objects). Accordingly, its design, its graphical user interface, and the core recommendation technique used to generate the recommendations are tailored to provide useful and effective suggestions for that specific type of item. RSs are primarily directed towards individuals who lack sufficient personal experience or competence to evaluate the potentially overwhelming number of alternative items that a web site, for example, may offer (Prasad and Kumari 2012). Recommender systems are an extensively studied and well-established field of research and application (Adomavicius and Tuzhilin 2005). Major search engines like Google and electronic shops like Amazon have incorporated recommendation technology in their services in order to personalize their results. Unfortunately, the algorithms underlying regular recommender systems are not directly transferable to the area of e-learning. In contrast to movies or books, the cognitive state of the learner and the learning content may change over time and context (Núñez-Valdéz et al. 2012). MovieLens recommendations are entirely based on the interests and the tastes of the user, whereas the learning activities preferred by learners might not be the most pedagogically adequate ones (Cosley et al. 2003). Even for learners with the same interests and tastes, we may need to recommend different learning activities, depending on individual proficiency levels, learning goals and context. For instance, learners with no prior knowledge in a specific domain should be advised to study basic learning material first, whereas more advanced learners should be advised to continue with more specific materials.

Researchers have investigated various recommendation techniques in order to suggest online learning activities or optimum browsing pathways to learners, based on their preferences, knowledge (Lytras and Ordóñez de Pablos 2011) and the browsing history of other learners with similar characteristics (Dráždilová et al. 2010). Ideally, a RS in e-learning environments should assist learners in discovering relevant learning actions that perfectly match their profile, at the right time, in the right context, and in the right way, keep them motivated, and enable them to complete their learning activities in an effective and efficient way (Tang and McCalla 2005).

A RS in e-learning environments introduces new challenges compared to standard applications. The main difference is that each learner uses her own tools, methods, paths, collaborations and processes. Consequently, guidance within the learning process must be personalized to an extreme extent. Such a RS utilizes information about learners and learning activities (LAs) and recommends items such as papers, web pages, courses, lessons and other learning resources which meet the pedagogical characteristics and interests of learners (Drachsler et al. 2008). It could provide recommendations of online learning materials or shortcuts. For example, rather than recommending resources that other users with similar interests have used, those recommendations are based on previous learners’ activities or on the learning styles of the learners that are discovered from their navigation patterns. To design an effective RS in e-learning environments, it is important to understand the specific learner characteristics desired in a RS (García et al. 2009; Drachsler et al. 2008): learning goal, prior knowledge, learner characteristics, learner grouping, rated learning activities (LAs), learning paths, and learning strategies. E-learning systems should be able to recognize and exploit these learner characteristics, which serve as guidelines for framework design and platform implementation of a good RS for e-learning (Angehrn et al. 2001; Zaïane 2001; Savidis et al. 2006). A good RS should be highly personalized, recommend materials at the appropriate time and location, support a non-disruptive view of experience, be socially situated, include the adoption phase, support the continuous learning process and provide a high level of interactivity.

In this introductory section we give a short overview of several techniques adopted in the e-learning domain in recent years, in order to demonstrate their importance for current development and to introduce them to readers who need to keep abreast of innovations. Verbert et al. (2012) discussed the importance of contextual information in the recommendation process. Contextual information refers to the learning environment, location, time, physical conditions, activity, resources and social relations (Staikopoulos et al. 2014). The results of the survey indicate that there has been much advancement in the development of context-aware e-learning recommenders in recent years. Soonthornphisaj et al. (2006) introduced the concept of global e-learning using a web service. The idea is to extend the e-learning system from local learners to global learners. Each e-learning website administrator must register as a member of the recommender system web service (a database of materials used for the collaborative filtering process). The benefit is the availability of a wide variety of material for learners beyond local boundaries. Lu (2004) proposed a framework for a personalized learning recommender system (PLRS), which aims to help learners find the learning materials they actually need to study. Its significance lies in the fact that the learner’s individual needs are fulfilled, with the aim of improving not only the learner’s career but personal life as well. The framework proposed by Lu can easily be applied to online teaching and learning sites. The information required for a PLRS is a learning material database and the learner’s personal information. The learners’ requirements can be obtained and matched with the existing learning material by using a computational analysis model. The proposed PLRS consists of four components: getting learners’ information, learner requirement identification, learning material matching analysis and learning recommendation generation. The PLRS has several advantages, including handling the sparsity problem, preventing false positive errors and recommending appropriate learning material more accurately. Ghauth and Abdullah (2010) illuminate the value of user ratings as a collaboration tool that helps other learners by suggesting suitable items. The system promotes collaboration among learners, who help each other during the learning process. It uses a “good learner rating” feature, in which learners who studied the learning material and obtained more than 80 % in the post-test can rate that learning material. The learning outcomes of several groups of learners who used different e-learning recommendation systems were compared. The results revealed that including good learners’ ratings in the content-based RS significantly improves the performance of the learners. Here we can get a good estimate of the knowledge gained by the learner.

In this paper, a comprehensive and systematic study of recommender systems for e-learning environments is carried out. The intention is twofold:

  • to provide an overview of state-of-the-art systems, by highlighting the techniques which revealed the most effective e-learning application domains;

  • to present a model for tagging activities and tag-based recommender systems, which can be applied to e-learning environments, as new trends and directions for future research which might lead towards the next generation of recommender systems.

For these purposes, we analyze the specific challenges, design requirements and (dis)advantages of current recommendation techniques and their usefulness for Personalized RS (PRS) in e-learning environments. In contrast to earlier surveys on recommender systems for learning, this article focuses specifically on collaborative tagging systems that can be used to extend the capabilities of RSs and to generate better recommendations. Collaborative tagging systems have grown in popularity on the Web in recent years owing to the simplicity with which they categorize and retrieve content using open-ended tags. The increasing number of users providing information about themselves through social tagging activities has led to the emergence of tag-based profiling approaches, which assume that users expose their preferences for certain contents through tag assignments (Gayo et al. 2010).

Tag-based RS in eLearning could support learners in their own learning path by recommending tags and learning resources, and also could promote the learning performance of individual learners. These systems can use different recommendation techniques in order to suggest online learning activities or optimal browsing pathways to students, based on their preferences, learning style, knowledge level and the browsing history of other students with similar characteristics.

Collaborative tagging systems can be distinguished according to the kind of resources they support. Flickr, for instance, allows the sharing of photos, del.icio.us the sharing of bookmarks, CiteULike and Connotea the sharing of bibliographic references, and 43Things even the sharing of goals in private life. These systems are all very similar. Once a user logs in, she/he can add a resource to the system and assign arbitrary tags to it. Besides helping users to organize their personal collections, a tag can also be regarded as an expression of a user’s or expert’s personal opinion, while tagging can be considered an implicit rating of or vote on the tagged information resources or items (Liang et al. 2008). Thus, the tagging information can be used to make recommendations. Tag-based RS in e-learning environments could support learners in their own learning path by recommending tags and learning resources, and could promote the learning performance of individual learners (Kim 2011; Anjorin et al. 2011).

The paper is organized as follows. The most important requirements and challenges for designing a recommender system in e-learning environments are presented in Sect. 2. Section 3 presents a survey of the state-of-the-art in recommendation techniques for RS in e-learning environments. Section 4 presents a model for tagging activities and tag-based recommender systems which can be applied to e-learning environments. Section 5 concludes this paper.

2 The most important requirements and challenges for designing a recommender system in e-learning environments

Personalized recommendation approaches were first proposed in the e-commerce area for product purchase (Balabanović and Shoham 1997; Resnick and Varian 1997), where they help consumers find products they would like to purchase by creating a list of recommended products for each given consumer (Cheung et al. 2003; Schafer et al. 2001).

RSs strongly depend on the context or domain they operate in, and it is often not possible to take a recommendation strategy (Drachsler et al. 2008) from one context and transfer it to another context or domain. The first challenge in designing a RS is to properly define the users and the purpose of the specific context or domain (McNee et al. 2006). The learning process includes three components: learners, teachers/instructors, and learning materials. From the teacher’s point of view, teaching is an activity that delivers information and skills to learners with some goals to be achieved. From the learner’s point of view, learning is an activity of acquiring information from the teacher in order to achieve the goals set by the teacher.

In a virtual classroom, teachers provide resources such as text, multimedia and simulations, and moderate and animate discussions. Remote learners are encouraged to peruse the resources and participate in activities. However, it is very difficult and time consuming for educators to thoroughly track and assess all the activities performed by all learners on all tools. Moreover, it is hard to evaluate the structure of the learning content and its effectiveness for the learning process. Resource providers do their best to structure the content, assuming its efficacy (Zaïane 2001). When instructors put together an on-line course, they may compile interactive course notes, simulations, demos, exercises, quizzes, asynchronous forums, chat tools, web resources, etc. This combination of on-line hyperlinked material can form a complex structure that is difficult to navigate. Hence, personalization features are needed which adaptively help learners monitor their learning progress and provide the resources or learning material that suit their needs (Guo and Zhang 2009).

A review of the literature shows that many researchers have attempted to adapt recommender systems to e-learning environments. For example, Shen and Shen (2005) and Vesin et al. (2013) described a mechanism for organizing learning materials based on a domain ontology, which can guide the recommendation of learning resources according to learning status. A multi-attribute assessment method proposed in Lu (2004) determines a learner’s need and deploys a fuzzy matching method to find the learning contents that best meet each learner’s need. Luo et al. (2002) presented a method to organize components and courseware using the hierarchy and association rules of the concepts, which can recommend related contents to learners and help them control their learning schedule. However, most of these methods miss one important issue in e-learning RS: natural learning behavior is not solitary but interactive, relying on friends, classmates, lecturers, and other sources to make learning choices. Instructors are in desperate need of non-intrusive and automatic ways to get objective feedback from learners in order to better follow the learning process and appraise the effectiveness of the on-line course structure. On the learner’s side, it would be very useful if the system could automatically guide the learner’s activities and intelligently recommend on-line activities or resources that would favor and improve the learning. The automatic recommendation could use the teacher’s intended sequence of navigation in the course material or, more interestingly, the navigation patterns of other successful learners. For example, if during the learning process a learner reads a useful material, summarizes what he/she has learned or obtains the answer to a typical question, other learners with a similar learning status are likely to need these resources.

A RS in e-learning environments utilizes information about learners and learning activities (LAs) and recommends items such as papers, web pages, courses, lessons and other learning resources which meet the pedagogical characteristics and interests of learners (Drachsler et al. 2008). Such a RS could provide recommendations of online learning materials or shortcuts. Those recommendations are based on previous learners’ activities or on the learning styles of the learners that are discovered from their navigation patterns. To design an effective RS in e-learning environments, it is important to understand specific learners’ characteristics (García et al. 2009; Chen et al. 2014):

  • learning goal,

  • prior knowledge,

  • learner characteristics,

  • learner grouping,

  • rated learning activities (LAs),

  • learning paths, and

  • learning strategies, desired in a RS.

E-learning systems should be able to recognize and exploit these learner characteristics, which serve as guidelines for framework design and platform implementation of a good RS for e-learning.

  • A good RS should be highly personalized. Relevant learning materials should be chosen and presented to learners or researchers based on learner’s learning style, interests, preferences, current activities, etc. (Sampson et al. 2010).

  • A good RS should recommend materials at the appropriate time and location. A good RS should deliver relevant learning materials to learners at the most appropriate time and location to facilitate their acquisition of knowledge and skills (Angehrn et al. 2001).

  • A good RS should support a non-disruptive view of experience. “Non-disruptive” means that learners have the option to either follow or discount relevant materials based on their learning needs (DeRouin et al. 2004).

  • A good RS should be socially situated. A good RS should be able to recognize and exploit the learners’ social networks, role models, levels of trust and influence, etc. RS should also help the learners to recognize their knowledge acquisition process in the context of the group (Zaïane 2001).

  • A good RS should include the adoption phase. A good RS should be able to monitor, understand and model the different phases of adoption of the knowledge by the learner. In particular it includes the phases in which the new concepts are experimented with, evaluated, internalized and finally applied (Savidis et al. 2006).

  • A good RS should support the continuous learning process. A good RS should support just-in-time learning by better analyzing learners’ current and future activities. In addition, it should provide motivational support and stimulation (Tippins and Sohi 2003).

  • A good RS should provide high level of interactivity. A good RS should provide very active, cognitive and diverse mode of interaction with the learner in the form of a rich choice of interaction strategies (Tippins and Sohi 2003).

  • A good RS should provide appropriate course materials according to learners’ learning style. Each person learns differently and needs to develop his/her own learning skills in his/her own way. Learners have different backgrounds, strengths and weaknesses, interests, ambitions, senses of responsibility, levels of motivation, and approaches to studying and learning. For example, different learners prefer different presentation forms: some prefer multimedia contents (simulations, presentations, graphical material and hypertext documents); while others prefer traditional web pages (questionnaires, exercises, research studies) (Cox and Tsai 2013).

Fig. 1 Recommendation techniques for RS in e-learning environments

3 Recommendation techniques for RS in e-learning environments: a survey of the state-of-the-art

E-learning systems use different recommendation techniques in order to suggest online learning activities to learners, based on their preferences, knowledge and the browsing history of other learners with similar characteristics. RSs assist the natural process of relying on friends, classmates, lecturers, and other sources to make choices for learning (Lu 2004).

Each recommendation strategy has its own strengths and weaknesses. Based on the set of the most important requirements for a good RS in e-learning environments, which were explored and defined in the previous section, in the remainder of this section we present a survey of the state-of-the-art in RS for e-learning environments. We identify challenges and various limitations of each traditional recommendation method, and then consider some tag-based profiling approaches for extending their capabilities. To make the material in the rest of the section easier to follow, we use a diagram that lists the surveyed research areas. This hierarchical structure includes all representative research examples (Fig. 1).

3.1 Matrix and tensor factorization methods

Most works on recommendation techniques in e-learning have focused on constructing recommender systems that recommend learning objects (materials/resources) or learning activities to learners (Ghauth and Abdullah 2010; Khribi et al. 2015) in both formal and informal learning environments (Drachsler et al. 2008).

On the other hand, educational data mining has been used to support universities, teachers, and learners. For example, to help learners improve their performance, we would like to know how the learners learn (e.g. generally or narrowly), how quickly or slowly they adapt to new problems, or whether it is possible to infer the knowledge requirements for solving the problems directly from learner performance data (Feng et al. 2009). Many universities focus heavily on assessment; the resulting pressure of “teaching and learning for examinations” leads to a significant amount of time being spent preparing for and taking standardized tests. Any advances that allow us to reduce the role of standardized tests hold the promise of increasing deep learning (Feng et al. 2009). From an educational data-mining point of view, a good model that accurately predicts learner performance could replace some current standardized tests. Many works have been published that address the problem of learner performance prediction. Most of them rely on traditional methods such as logistic regression (Cen et al. 2006), collaborative filtering (Pero and Horváth 2015), linear regression (Feng et al. 2009), decision trees (Thai-Nghe et al. 2007), neural networks (Romero and Ventura 2006), support vector machines (Thai-Nghe et al. 2009), and so on.

Thai-Nghe et al. (2011a) have proposed using recommendation techniques, especially matrix factorization, for predicting student performance (PSP). The authors have shown that using recommendation techniques could improve prediction results compared to regression methods.

Overall, in student performance prediction, two “crucial aspects” need to be considered (Thai-Nghe et al. 2011a):

  1. The probability that students do not know how to solve the problem (or do not know some of the required skills related to the problem) but guess correctly (called “guess” for short), and the probability that students know how to solve the problem (or know all of the required skills related to the problem) but make a mistake (called “slip” for short); these effects are user-dependent; and

  2. Students’ knowledge improves over time, e.g. the second time a student does his exercises, his performance on average gets better; therefore, the sequence effect is important information.

Matrix factorization techniques, one of the most successful methods for rating prediction, are appropriate for the first problem since they implicitly encode the “slip” and “guess” factors in their latent factors (Pardos and Heffernan 2010). Moreover, if we would like to incorporate the sequential (time) aspect or any other context such as skills in the second “crucial aspect”, then tensor factorization techniques are suitable for solving this problem (Thai-Nghe et al. 2011a).

The remainder of this section shows how rating prediction can be mapped to predicting student performance. We first present the problem of predicting student performance, and then summarize the standard matrix and tensor factorization techniques.

3.1.1 Predicting student performance (PSP)

The problem of predicting student performance is to predict the likely performance of a student on some exercises (or parts of them, such as particular steps), which we call tasks. A task could be to solve a particular step in a problem, to solve a whole problem or to solve the problems in a section or unit, etc. (Thai-Nghe et al. 2011a).

Let S be a set of students, I a set of tasks, and \(P\subseteq R^{+}\) a range of possible performance scores. Let \(D^{train}\subseteq \left( {S\times I\times P} \right) \) and \(D^{test}\subseteq \left( {S\times I\times P} \right) \) be the observed and unobserved student performances, respectively. Finally, let

$$\begin{aligned}&\pi _P :S\times I\times P\rightarrow P,\quad (s,i,p)\rightarrow p\quad and \\&\pi _{s,i} :S\times I\times P\rightarrow S\times I,\quad (s,i,p)\rightarrow (s,i) \end{aligned}$$

be the projections to the performance measure and to the student-task pair. Then the problem of student performance prediction is, given \(D^{train}\) and \(\pi _{s,i} (D^{test})\) (in certain cases, also given the meta-data about the students and the tasks), to find

$$\begin{aligned} \hat{{p}}=\hat{{p}}_1 ,\hat{{p}}_2 ,\ldots ,\hat{{p}}_{\left| {D^{test}} \right| } , \end{aligned}$$

Matrix factorization is the task of approximating a matrix X by the product of two smaller matrices W and H, i.e. \(X\approx WH^{T}\) (Koren et al. 2009). In the context of recommender systems the matrix X is the partially observed ratings matrix, \(W\in R^{U\times K}\) is a matrix where each row u is a vector containing the K latent factors describing the user u and \(H\in R^{I\times K}\) is a matrix where each row i is a vector containing the K factors describing the item i. Let \(w_{uk}\) and \(h_{ik}\) be the elements of W and H, respectively, then the rating given by a user u to an item i is predicted by:

$$\begin{aligned} \hat{{r}}_{ui} =\sum _{k=1}^K {w_{uk} h_{ik} } =({ WH}^{T})_{u,i} \end{aligned}$$

where W and H are the model parameters, which can be learned by optimizing an objective function over the observed ratings under a criterion such as root mean squared error (RMSE):

$$\begin{aligned} \mathop {\min }\limits _{W,H} \sum _{(u,i)\in D^{train}} {(r_{ui} -\hat{{r}}_{ui} )^{2}}+\lambda (\left\| W \right\| ^{2}+\left\| H \right\| ^{2}) \end{aligned}$$

where \(\lambda \) is a regularization term, which is used to prevent overfitting. The model parameters can be optimized using stochastic gradient descent (Bottou 2004), and the RMSE on the test set is computed as:

$$\begin{aligned} { RMSE}=\sqrt{\frac{\sum \nolimits _{ui \in D^{test}} {(r_{ui} -\hat{{r}}_{ui} )^{2}} }{\left| {D^{test}} \right| }} \end{aligned}$$

Figure 2 presents an illustration of matrix decomposition.

Fig. 2 An illustration of matrix factorization
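The factorization and its optimization by stochastic gradient descent can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation used in the cited works: the function names (`train_mf`, `predict`, `rmse`), the hyperparameter values and the toy data are all hypothetical, and plain Python lists stand in for the matrices W and H.

```python
import math
import random


def train_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit W (users x k) and H (items x k) by SGD on the regularized squared error."""
    rng = random.Random(seed)
    W = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    H = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    data = list(ratings)
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r in data:
            # Prediction r_hat = sum_k w_uk * h_ik and its error on one observation
            err = r - sum(W[u][f] * H[i][f] for f in range(k))
            for f in range(k):
                w_uf, h_if = W[u][f], H[i][f]
                # Gradient step on (r - r_hat)^2 + reg * (|w|^2 + |h|^2)
                W[u][f] += lr * (err * h_if - reg * w_uf)
                H[i][f] += lr * (err * w_uf - reg * h_if)
    return W, H


def predict(W, H, u, i):
    """Predicted rating: dot product of the latent vectors of user u and item i."""
    return sum(w * h for w, h in zip(W[u], H[i]))


def rmse(W, H, data):
    """Root mean squared error over a list of (user, item, rating) triples."""
    return math.sqrt(sum((r - predict(W, H, u, i)) ** 2 for u, i, r in data) / len(data))


# Toy (student, task, score) triples; the values are illustrative only.
train = [(0, 0, 1.0), (0, 1, 0.0), (1, 0, 1.0), (1, 2, 1.0), (2, 1, 0.0), (2, 2, 1.0)]
W, H = train_mf(train, n_users=3, n_items=3)
train_rmse = rmse(W, H, train)
```

In practice the RMSE would be reported on held-out data such as \(D^{test}\); the training error is computed here only to keep the toy example small.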

Tensor factorization is a generalization of matrix factorization. Consider a three-mode tensor Z of size \(U\times I\times T\), where the first and the second mode describe the user and the item as in the matrix case, and the third mode describes the context, e.g. time, with size T. Then Z can be written as a sum of rank-1 tensors:

$$\begin{aligned} Z\approx \sum _{k=1}^K {w_k \circ h_k \circ q_k } \end{aligned}$$

where \(\circ \) is the outer product and each vector \(w_k \in R^{U},\;h_k \in R^{I}\,\,and\,\;q_k \in R^{T}\) describes the latent factors of user, item, and time, respectively (Kolda and Bader 2009; Dunlavy et al. 2011). An illustration of tensor decomposition is presented in Fig. 3. The model parameters were also optimized for RMSE using stochastic gradient descent.

Fig. 3 An illustration of tensor factorization
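To make the sum of rank-1 tensors concrete, the following sketch reconstructs single entries of Z from given factors. It assumes, purely for illustration, that the factors are stored row-wise, so that `W[u][k]` holds the u-th component of \(w_k\) (and likewise for `H` and `Q`); the function name `cp_entry` and the numbers are hypothetical.

```python
def cp_entry(W, H, Q, u, i, t):
    """Entry (u, i, t) of sum_k w_k o h_k o q_k, i.e. sum_k W[u][k] * H[i][k] * Q[t][k]."""
    return sum(w * h * q for w, h, q in zip(W[u], H[i], Q[t]))


# Two latent factors (K = 2), two users, two items, two time steps.
W = [[1.0, 0.0], [0.0, 2.0]]
H = [[1.0, 1.0], [2.0, 0.0]]
Q = [[1.0, 1.0], [0.0, 3.0]]

# Full reconstructed tensor Z[u][i][t]
Z = [[[cp_entry(W, H, Q, u, i, t) for t in range(2)] for i in range(2)] for u in range(2)]
```

With one time step and \(q_k = 1\) for all k, the expression reduces to the matrix factorization prediction \(\sum_k w_{uk} h_{ik}\).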

3.1.2 Mapping educational data to recommender systems

In traditional recommender system settings, it is unambiguous how the available information maps to users, items, and ratings, respectively. At least for all major recommender system data sets used (MovieLens and Netflix) there is a unique assignment.

There is an obvious mapping of users and ratings in the student performance prediction problem:

$$\begin{aligned}&student\rightarrow user \\&correct first\;attempt\rightarrow rating \end{aligned}$$

The student becomes the user, and the correct first attempt (CFA) indicator would become the rating, bounded between 0 and 1. With this setting there are no users in the test set that are not present in the training set which simplifies predictions. For mapping the item, two options seemed to be suitable (Thai-Nghe et al. 2009):

  (1) Solving-step: a combination of problem hierarchy, problem name, step name, and problem view; and

  (2) Skills (knowledge components): information about the number of users, items, and ratings.
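The mapping with the solving-step as item can be sketched as a small preprocessing routine. The record field names used below (`student`, `problem_hierarchy`, `problem_name`, `step_name`, `problem_view`, `cfa`) are hypothetical placeholders rather than the schema of any particular data set, and the function name `to_triples` is likewise an assumption.

```python
def to_triples(records):
    """Map interaction logs to (user, item, rating) triples with integer ids.

    user   <- student
    item   <- solving-step (problem hierarchy, problem name, step name, problem view)
    rating <- correct first attempt (CFA), a value between 0 and 1
    """
    user_ids, item_ids, triples = {}, {}, []
    for rec in records:
        step = (rec["problem_hierarchy"], rec["problem_name"],
                rec["step_name"], rec["problem_view"])
        # Assign the next free integer id the first time a student or step is seen
        u = user_ids.setdefault(rec["student"], len(user_ids))
        i = item_ids.setdefault(step, len(item_ids))
        triples.append((u, i, float(rec["cfa"])))
    return triples, user_ids, item_ids


logs = [
    {"student": "s1", "problem_hierarchy": "Unit A", "problem_name": "P1",
     "step_name": "x+1=3", "problem_view": 1, "cfa": 1},
    {"student": "s1", "problem_hierarchy": "Unit A", "problem_name": "P1",
     "step_name": "x=2", "problem_view": 1, "cfa": 0},
    {"student": "s2", "problem_hierarchy": "Unit A", "problem_name": "P1",
     "step_name": "x+1=3", "problem_view": 1, "cfa": 1},
]
triples, user_ids, item_ids = to_triples(logs)
```

The resulting triples have the same shape as ordinary rating data, so a standard factorization model can be trained on them without modification.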

3.1.3 Matrix factorization: implicitly encoding the “slip” and “guess” factors

The biased matrix factorization model can be employed to address the “user effect” and the “item effect”. In the educational setting the user and item biases are, respectively, the student and solving-step biases. They model how good a student is (i.e. how likely the student is to perform a task correctly) and how difficult the solving-step is (i.e. how likely the step is, in general, to be performed correctly).

The prediction function for user u and item i is given by (Thai-Nghe et al. 2011a):

$$\begin{aligned} \hat{{r}}_{ui} =\mu +b_u +b_i +\sum _{k=1}^K {w_{uk} h_{ik} } \end{aligned}$$

where \(\mu ,b_u \,and\, b_i \) are global average, user bias and item bias, respectively.
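As a small worked illustration of the biased prediction function, the sketch below evaluates it for one student and one solving-step. The function name `predict_biased` and all numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def predict_biased(mu, b_u, b_i, w_u, h_i):
    """r_hat_ui = mu + b_u + b_i + sum_k w_uk * h_ik."""
    return mu + b_u + b_i + sum(w * h for w, h in zip(w_u, h_i))


mu = 0.6          # global average of the correct-first-attempt indicator
b_u = 0.1         # student bias: this student does slightly better than average
b_i = -0.2        # solving-step bias: this step is harder than average
w_u = [0.5, 1.0]  # latent factors of the student
h_i = [0.2, 0.1]  # latent factors of the solving-step

r_hat = predict_biased(mu, b_u, b_i, w_u, h_i)  # 0.6 + 0.1 - 0.2 + (0.1 + 0.1)
```

The biases capture the overall ability of the student and the overall difficulty of the step, while the latent-factor term captures their specific interaction.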

3.1.4 Tensor factorization for exploring the temporal effect

Students’ knowledge accumulates over time, so the temporal effect is an important factor in predicting student performance. Thai-Nghe et al. (2011a) adapted the idea from Dunlavy et al. (2011), which applies tensor factorization to link prediction. Instead of using only a two-mode tensor (a matrix) as in the previous section, one more mode, the time mode, can be added to the model. In addition, the “user bias” and “item bias” can be taken into account. The prediction function now becomes:

$$\begin{aligned} \hat{{r}}_{uiT}= & {} \mu +b_u +b_i +\left( \sum _{k=1}^K {w_{uk} h_{ik} \Phi _{Tk} } \right) \\ \Phi _{Tk}= & {} \frac{\sum \nolimits _{t=(T-T_{\max } +1)}^T {q_{tk} } }{T_{\max } } \end{aligned}$$

where \(q_k \) is a latent factor vector representing the time, and \(T_{\max }\) is the number of solving-steps in the history that we want to go back. This is a simple strategy, which averages \(T_{\max } \) performance steps in the past to predict the current step. This approach is called TFA (Tensor Factorization-Averaging) (Thai-Nghe et al. 2011a).

Another important factor is that human memory is limited, so students may forget what they studied in the past; e.g., they may perform better on lessons they learned recently than on ones they learned a year or more ago. Moreover, the second time students do their exercises they have had more chances to learn the skills, so their performance on average gets better. Thus, a decay function with parameter \(\theta \) can be used, which reduces the weight of a step the further back in the history it lies. This approach is called TFW (Tensor Factorization-Weighting) (Thai-Nghe et al. 2011a).

$$\begin{aligned} \Phi _{Tk} =\frac{\sum \nolimits _{t=(T-T_{\max } +1)}^T {q_{tk} e^{t-\theta }} }{T_{\max } } \end{aligned}$$

An open issue is whether forecasting techniques could be used instead of weighting or averaging. This could be analyzed in future work.
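The two ways of summarizing the time mode can be compared in a short sketch. It assumes a time-factor table `Q` indexed as `Q[t][k]` with 0-based time steps, implements the weighting term \(e^{t-\theta }\) exactly as printed above, and uses hypothetical function names (`phi_tfa`, `phi_tfw`, `predict_tf`) and toy numbers.

```python
import math


def phi_tfa(Q, T, T_max, k):
    """TFA: plain average of q_tk over the last T_max steps t = T-T_max+1 .. T."""
    return sum(Q[t][k] for t in range(T - T_max + 1, T + 1)) / T_max


def phi_tfw(Q, T, T_max, k, theta):
    """TFW: same window, but each q_tk is weighted by exp(t - theta)."""
    return sum(Q[t][k] * math.exp(t - theta) for t in range(T - T_max + 1, T + 1)) / T_max


def predict_tf(mu, b_u, b_i, w_u, h_i, phi):
    """r_hat_uiT = mu + b_u + b_i + sum_k w_uk * h_ik * Phi_Tk."""
    return mu + b_u + b_i + sum(w * h * p for w, h, p in zip(w_u, h_i, phi))


# One latent factor (k = 0) observed over four time steps.
Q = [[0.2], [0.4], [0.6], [0.8]]
tfa = phi_tfa(Q, T=3, T_max=2, k=0)             # (0.6 + 0.8) / 2
tfw = phi_tfw(Q, T=3, T_max=2, k=0, theta=3.0)  # (0.6 * e^-1 + 0.8 * e^0) / 2
r_hat = predict_tf(0.5, 0.0, 0.0, [1.0], [1.0], [tfa])
```

With \(\theta \) equal to the current step, the most recent step keeps weight 1 while older steps are discounted exponentially, which is one way to read the “limited memory” argument.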

In this section, we have presented how recommender systems (e.g. factorization techniques) can be applied for educational performance data, especially for predicting student performance but not for recommending learning objects. The delivered recommendations depend on the goal of the system (Thai-Nghe et al. 2011a):

  1. In the case of rating prediction, we can analyze a student’s performance on a given item, that is, determine the expected performance on all tasks the student has not yet solved. We then have several choices: offer the student tasks that she is more likely to solve successfully, let the teacher decide on the appropriate combination of tasks, or simply present the prediction as a measure of the estimated difficulty of a task for the student at hand.

  2. If we are most interested in the items the student is expected to have problems with, and which we therefore want to present for further learning (item prediction is then the recommender system task at hand), we compute the expected score for each item the student has not attempted in the past. We rank these items by predicted score and present the student with the k items on which failure is most probable.

  3. We can consider classic problems such as recommending to the student the top-k learning resources (courses, materials, tasks) that he/she is most likely to be interested in.
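The first two goals differ only in the direction of the ranking, which a short sketch can make explicit. It assumes that the predicted score of an item is the predicted probability of solving it successfully; `rank_unattempted` and the task identifiers are hypothetical names.

```python
def rank_unattempted(predicted, attempted, k, most_difficult=True):
    """Return k unattempted items ranked by predicted score.

    predicted maps item -> predicted success score (assumed in [0, 1]).
    With most_difficult=True the items with the lowest predicted score
    (highest risk of failure) come first; otherwise the easiest come first.
    """
    candidates = {item: s for item, s in predicted.items()
                  if item not in attempted}
    ordered = sorted(candidates, key=candidates.get,
                     reverse=not most_difficult)
    return ordered[:k]


predicted = {"t1": 0.9, "t2": 0.3, "t3": 0.6, "t4": 0.1}
attempted = {"t4"}
hard_first = rank_unattempted(predicted, attempted, k=2)
easy_first = rank_unattempted(predicted, attempted, k=2, most_difficult=False)
```

Here `hard_first` lists the tasks suggested for further practice, while `easy_first` lists the tasks the student is most likely to solve.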

As shown above, matrix factorization can implicitly take into account the two latent factors “slip” and “guess” when predicting student performance. Moreover, since the knowledge of learners improves over time, tensor factorization methods can be used to model the temporal effect. In future work, a forecasting approach could replace the averaging or weighting approaches on the third mode of the tensor in order to take the sequential effect into account (Rendle 2011). Furthermore, each solving-step relates to one or more skills, so multi-relational matrix factorization could be applied to address this problem (Thai-Nghe et al. 2011b).

3.2 Collaborative filtering approach

Collaborative systems track past actions of a group of learners to make a recommendation for individual members of the group (Tan et al. 2008). Based on the assumption that learners with similar past behaviors (rating, browsing, or learning path) have similar interests, a collaborative filtering system recommends learning objects the neighbors of the given learner have liked.

This approach relies on a historical record of all learner interests, as can be inferred from their ratings of the items (learning objects/learning actions) on a website. Ratings can be explicit (explicit ratings or customer satisfaction questionnaires) or implicit (derived from the studying patterns or click-stream behavior of the learners). For example, the proportion of actual studying hours to the total hours of the course can be recorded as an implicit rating score and transformed into a corresponding explicit rating score from 1 to 5. The learners’ rating scores can be given in an \(m\times n\) matrix, as shown in Table 1, where \(L=\left\{ {l_1 ,l_2 ,\ldots ,l_m } \right\} \) is a list of m learners, \(O=\left\{ {o_1 ,o_2 ,\ldots ,o_n } \right\} \) is the list of n learning objects, and \(R_{j,k} \) is the rating of object \(o_k \) given by learner j. Alternatively, it can be the rating of object \(o_k \) given by an intelligent tutoring system for learner j. There is a distinguished learner \(l_a \in L\), called the active learner, for whom the collaborative filtering algorithm has to estimate how well each learning object is likely to be liked.

Table 1 Learner’s rating matrix
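The conversion from implicit to explicit ratings mentioned above can be sketched as follows. The text does not specify the mapping, so the sketch assumes five equal-width bins over the proportion of studied hours; `implicit_to_explicit` and the sample hours are hypothetical.

```python
import math


def implicit_to_explicit(studied_hours, course_hours):
    """Map the proportion of studied hours to a rating from 1 to 5.

    Assumption of this sketch: equal-width bins, i.e. a proportion in
    (0.0, 0.2] maps to 1, (0.2, 0.4] to 2, and so on; 0.0 maps to 1.
    """
    proportion = min(max(studied_hours / course_hours, 0.0), 1.0)
    return max(1, math.ceil(proportion * 5))


# Sparse rating matrix R as a nested dict: learner -> object -> rating.
hours = {"l1": {"o1": 5, "o2": 10}, "l2": {"o1": 0}}
course_hours = 10
R = {learner: {obj: implicit_to_explicit(h, course_hours)
               for obj, h in objs.items()}
     for learner, objs in hours.items()}
```

Storing only the observed entries keeps the matrix compact when most learners have rated few objects.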

The neighborhood formation scheme usually uses Pearson correlation or cosine similarity as a measure of proximity (Shardanand and Maes 1995; Resnick and Varian 1997). An exploratory study of a recommender system that uses collaborative filtering to support (virtual) learners in a learning network was reported in Koper and Olivier (2004). The authors simulated rules for increasing/decreasing motivation and some other disorder factors in learning networks, using the Netlogo tool. Closely related to this study is an experiment reported in Janssen et al. (2007), in which the authors offered learners a similar recommendation system. The recommendations did not take personal characteristics of learners (or possible ‘matching errors’) into account. Another system, implemented by Soonthornphisaj et al. (2006), allows all learners to pool their expertise in order to predict the most suitable learning materials for each learner. This smart e-learning system applies the collaborative filtering approach to predict the most suitable documents for the learner. All learners can introduce new material by uploading documents to the server or by pointing to a Web link, and can rate the currently available materials.
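A neighborhood-based prediction with Pearson correlation can be sketched as below. The mean-centered weighted average used for the prediction is one common choice rather than the method of any particular system discussed here, and the learners, objects and ratings are invented for illustration.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the objects rated by both learners."""
    common = set(ratings_a) & set(ratings_b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[o] for o in common) / len(common)
    mean_b = sum(ratings_b[o] for o in common) / len(common)
    num = sum((ratings_a[o] - mean_a) * (ratings_b[o] - mean_b) for o in common)
    den = (sqrt(sum((ratings_a[o] - mean_a) ** 2 for o in common))
           * sqrt(sum((ratings_b[o] - mean_b) ** 2 for o in common)))
    return num / den if den else 0.0


def predict(ratings, active, obj):
    """Predict the active learner's rating of obj from all other learners."""
    mean_active = sum(ratings[active].values()) / len(ratings[active])
    num = den = 0.0
    for other, r in ratings.items():
        if other == active or obj not in r:
            continue
        sim = pearson(ratings[active], r)
        mean_other = sum(r.values()) / len(r)
        num += sim * (r[obj] - mean_other)
        den += abs(sim)
    return mean_active + num / den if den else mean_active


ratings = {
    "l1": {"o1": 5, "o2": 3, "o3": 4},
    "l2": {"o1": 5, "o2": 3, "o3": 4, "o4": 5},
    "l3": {"o1": 1, "o2": 5, "o3": 2, "o4": 1},
}
similarity = pearson(ratings["l1"], ratings["l2"])
prediction = predict(ratings, "l1", "o4")
```

Learner l2 agrees with the active learner l1 and liked o4, while l3 disagrees with l1 and disliked o4, so both neighbors push the prediction for o4 above the mean rating of l1.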

One of the first attempts to develop a collaborative filtering system for learning resources was the Altered Vista (AV) system (Recker and Walker 2003; Recker et al. 2003; Walker et al. 2004). The AV system (Walker et al. 2004) uses a database in which learner evaluations of learning resources are stored. Learners can browse the reviews of others and can get personalized learning resource recommendations from the system. AV does not aim to support learners directly by giving them feedback on their work. Instead, AV provides indirect learning support by recommending suitable learning tools. The team working on AV explored several relevant issues, such as the development of non-authoritative metadata to store learner-provided evaluations (Recker and Walker 2003), the design of the system and the review scheme it uses (Walker et al. 2004), as well as results from pilot and empirical studies in which the system was used to recommend both interesting resources and people with similar tastes and beliefs to the members of a community. A survey-based evaluation of AV showed predominantly positive feedback, but also identified issues with the system’s incentive and with regard to privacy (Walker et al. 2004).

Another educational collaborative filtering application is the Web-based PeerGrader (PG) (Gehringer 2001; Lynch et al. 2006). The purpose of this tool is to help learners improve their skills by blindly reviewing and evaluating the solutions of their fellow learners. PG works in the following way: first, the learners get a task list and each learner chooses a task. Next, the learners submit their solutions to the system. Other learners can read these solutions and provide feedback in the form of textual comments. After that, the authors modify their solutions based on the comments they have received and re-submit the modified solutions to the system, where other learners can review them. Then, the solutions’ authors grade each review with respect to whether it was helpful or not. Finally, the system calculates grades for all learner solutions. One of PG’s strengths is that it provides learners with high-quality feedback even on ill-defined homework tasks that do not have clear-cut gold standard solutions (such as design problems). This kind of feedback could not be generated automatically. A disadvantage is the time required for the system to work effectively: due to the complexity of the reviewing process and the textual comments, the evaluation of a single learner answer is very time consuming. This may cause learner dropouts and deadline problems (Lynch et al. 2006). In addition, studies with PG revealed problems with getting feedback of high quality. An evaluation of subjective usefulness showed that the system was appreciated by its users (Lynch et al. 2006), yet a systematic comparison of PG scores to expert grades has not been conducted.

A newer web-based collaborative filtering system, the Scaffolded Writing and Rewriting in the Discipline (SWoRD) system (Cho and Schunn 2007; Cho et al. 2006), addresses the problem of writing homework in the form of a long text, which a teacher cannot review in detail for reasons of time. Because of this, learners often do not receive any detailed feedback on their solutions at all. Such feedback would be beneficial for learners, though, since they could use it to improve their future work. To address this problem, SWoRD relies on peer reviews. Students who conduct peer reviews read possible task solutions provided by other students and have to evaluate them. They are thus required to engage with both wrong and excellent solutions of others (and learn the skill of evaluating), and for more open tasks they have the opportunity to learn about different possible points of view. Based on these peer reviews, the system can then give feedback to learners, e.g., by recommending good answers to students who seem to have problems with a certain task. Such an approach has the potential to effectively relieve the workload of teachers and tutors, and can at the same time help students develop evaluation skills that they do not often learn in formal education. An evaluation showed that participants benefitted more from the feedback of multiple peers than from that of a single peer or a single expert (Cho and Schunn 2007).

A different approach is used by the LARGO system (Pinkwart et al. 2006), where learners create graphs of US Supreme Court oral arguments. Within LARGO, collaborative scoring is employed to assess the quality of a “decision rule” that a learner has included in his diagram. Since this assessment involves interpretation of legal argument in textual form, it cannot be automated reasonably. While the overall LARGO system has been tested in law schools and shown to help lower-aptitude learners (Pinkwart et al. 2007), empirical studies to test the educational effectiveness of the specific collaborative scoring components have not been conducted.

The Rule-Applying Collaborative Filtering (RACOFI) Composer system (Anderson et al. 2003; Lemire 2005; Lemire et al. 2005) combines two recommendation approaches: a collaborative filtering engine, which works with the ratings that learners provide for learning resources, is integrated with an inference rule engine, which mines association rules between the learning resources and uses them for recommendation. RACOFI studies have not yet assessed the pedagogical value of the recommender, nor do they report any evaluation of the system by learners.

Manouselis and Costopoulou (2007) tried a typical, neighborhood-based set of collaborative filtering algorithms in order to support learning object recommendation. The examined algorithms were multi-attribute ones, allowing the recommendation service to consider the multi-dimensional ratings that learners provide on learning resources. The performance of the same algorithms changes depending on the context in which they are tested. The results of a comparative study of the same algorithms in an e-commerce and an e-learning setting (Manouselis and Costopoulou 2007) led to the selection of different algorithms from the same set of candidates.

In summary, the relatively few educational technology systems with collaborative filtering components all have an underlying algorithm to determine solution quality based on collaborative scoring. Yet, existing systems are often specialized for a particular application area such as legal argumentation (LARGO), writing skills training (SWoRD), or educational resource recommendation (AV), or they involve a rather complicated and long-term review process (SWoRD, PG).

CF-based techniques in general suffer from several limitations. Two serious limitations with respect to recommendation quality are the sparsity problem and the cold start problem (Lu 2004). The sparsity problem occurs when the available data is insufficient for identifying similar learners or items (neighbors) because of the immense number of learners and items (Sarwar et al. 1998). Even when learners are very active, each individual has rated only a very small portion of the items, so it is difficult for collaborative filtering based recommender systems to compute the neighborhood precisely and to identify the learning objects to recommend (Linden et al. 2003). Another severe problem is the cold start (first-rater) problem, which occurs when a new learner or learning object is introduced and thus has no previous rating information available (Massa and Avesani 2004). In this situation, the system is generally unable to make high-quality recommendations.
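Both limitations can be quantified on a rating matrix with a few lines of code. The following sketch assumes ratings stored as a nested dictionary (learner to object to rating); the function names, the `min_ratings` threshold and the data are hypothetical.

```python
def sparsity(ratings, n_learners, n_objects):
    """Fraction of the learner-object matrix that has no rating."""
    observed = sum(len(r) for r in ratings.values())
    return 1.0 - observed / (n_learners * n_objects)


def cold_start_learners(ratings, learners, min_ratings=1):
    """Learners with fewer than min_ratings ratings (no usable history)."""
    return [l for l in learners if len(ratings.get(l, {})) < min_ratings]


learners = ["l1", "l2", "l3"]
ratings = {"l1": {"o1": 4, "o2": 5}, "l2": {"o3": 2}}
matrix_sparsity = sparsity(ratings, n_learners=len(learners), n_objects=4)
new_learners = cold_start_learners(ratings, learners)
```

In this toy example three of the twelve possible ratings are observed, and learner l3 has no ratings at all, so no neighborhood can be formed for l3.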

CF-based techniques rely heavily on explicit learner input (e.g., previous customers’ ratings/rankings of products), which is often either unavailable or considered intrusive. When such learner input is sparse, recommendation precision and quality drop significantly (Lops et al. 2013), because without good and trusted ratings entered by the learners, recommendations become useless and untrustworthy. To recommend learning activities or learning objects, it is better to use learners’ real past activities (history logs) as input for their profiles (Šimić 2004). In addition, in the case of an intelligent tutoring system, the collaborative filtering (CF) approach can be based on the ratings (grades) of learners’ knowledge levels provided by the tutoring system (Klašnja-Milićević et al. 2011).

3.3 Content-based techniques

Content-based techniques recommend items (learning objects/learning actions) similar to the ones the learners preferred in the past. They base their recommendations on individual information and ignore contributions from other learners (Billsus and Pazzani 1998). In content-based systems, items are described by a common set of attributes. A learner’s preferences are predicted by considering the association between the item ratings and the corresponding item attributes. Therefore, a learner can receive proper recommendations without help from other learners. Content-based techniques can be classified into two different categories (Schmitt and Bergmann 1999; Aguzzoli et al. 2002; Wilson et al. 2003):

  1. Case based reasoning (CBR) techniques, and

  2. Attribute-based techniques.

Case based reasoning (CBR) techniques recommend items with the highest correlation to items the learner liked before. Case-based reasoning is useful for keeping the learner informed about the intended learning goals. These techniques are domain-independent and do not require content analysis, and the quality of the recommendations improves over time as learners rate more items. The new learner problem also applies to case-based reasoning techniques. In addition, specific disadvantages of case-based reasoning are overspecialization and sparsity, because only items that are highly correlated with the learner profile or interests can be recommended. Through case-based reasoning, the learner is limited to a set of items that are similar to the items she/he already knows (Adomavicius and Tuzhilin 2005).

Recent research papers present different facets of CBR in support of teaching or learning. Pixed (Project Integrating eXperience in Distance Learning), an adaptive, ontology-based hypermedia system, implements a case based reasoning method (Heraud et al. 2004). The Pixed approach treats the learner as a kind of expert on her/his own learning skills, or at least as a real practitioner of her/his own practices. The learner builds her/his knowledge by interacting with the learning environment, trying to benefit as much as possible from the available educational activities. Learning is considered a problem-solving task. The goal is to learn a specific concept proposed in the domain knowledge ontology. The way to reach this goal is one particular path among the different available educational activities linked to that ontology. Sormo and Aamodt (2002) propose building “a cognitive model of how humans solve problems in the domain and use this model in attempting to solve the problem, both from the point of view of the current learner (using the learner model) and of an expert (represented by an expert model)”. The case-based reasoner has to evaluate the learner’s solution and to explain why it does or does not fit the observed features of the problem. The research of Funk and Conlan (2003) is more closely related to Pixed. Their goal is the same: to use learner feedback in order to adapt the learning environment. The learner feedback can be exploited in two ways: directly during the learning process, in the form of learners’ comments, and by authors and tutors after the learning process, who integrate it into the proposed courses by comparing the learners’ results with the results of other cases. The authors associate CBR with filtering techniques by attempting to create learner profiles that take different kinds of feedback into account.

Elorriaga and Fernandez-Castro (2000) propose to use CBR to implement an instructional planner, which adapts the sequences observed in logs in order to create instructional sequences for a complete course. In Heraud et al. (2004), a case-based reasoning system was developed to offer navigational guidance to the learner. It is based on past users’ interaction logs and includes a model describing learning sessions.

Attribute-based techniques recommend items based on the matching of their attributes to the learner profile. Attributes can be weighted by their importance to the learner. Adding new learning activities (LA) or learners to the network does not cause any problem. Attribute-based techniques are sensitive to changes in the profiles of the learners (Drachsler et al. 2008). Learners can always control the personalized RS by changing their profile or the relative weight of the attributes. A description of needs in their profile is mapped directly to the available LA. A serious disadvantage is that an attribute-based recommendation is static and not able to learn from the network behavior. For this reason, highly personalized recommendations cannot be achieved. Attribute-based techniques work only with information that can be described in categories. Media types, like audio and video, first need to be classified according to the topics in the profile of the learner. This requires category modeling and maintenance, which could impose serious limitations on learning environments. In addition, overspecialization can be a problem, especially if learners do not change their profile. Attribute-based recommendations are useful for handling the ‘cold-start’ problem because no behavior data about the learners is needed. Attribute-based techniques can directly map characteristics of learners (like learning goal, prior knowledge, and available study time) to characteristics of LA (Drachsler et al. 2008). Several applications use attribute-based techniques for problems such as prediction and visualization. The Attribute-based Ant Colony System (AACS) (Yang and Wu 2009) finds learning objects that are suitable for a learner based on the learning trails most frequently followed by previous learners. The system updates the pheromones of the trails followed by learners with different knowledge levels and learning styles to create a powerful and dynamic search mechanism. There are three prerequisites for achieving this:

  (a) The adaptive learning portal knows the learner’s attributes, which include the learner’s knowledge level and learning style.

  (b) The learner’s attributes and the learning objects’ attributes have been annotated by teachers or content providers.

  (c) The relationships between learners and learning objects are matched.
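The basic attribute matching that underlies such techniques, with attributes weighted by their importance to the learner, can be sketched as follows. This is a plain weighted-overlap score, not the ant colony mechanism of AACS; the attribute names, weights and activities are hypothetical.

```python
def match_score(profile, activity, weights):
    """Weighted share of profile attributes that the activity matches."""
    total = sum(weights.values())
    matched = sum(w for attr, w in weights.items()
                  if profile.get(attr) == activity.get(attr))
    return matched / total if total else 0.0


profile = {"goal": "statistics", "level": "beginner", "time": "short"}
weights = {"goal": 3, "level": 2, "time": 1}  # relative importance
activities = {
    "a1": {"goal": "statistics", "level": "beginner", "time": "long"},
    "a2": {"goal": "statistics", "level": "advanced", "time": "short"},
    "a3": {"goal": "algebra", "level": "beginner", "time": "short"},
}
ranking = sorted(activities,
                 key=lambda a: match_score(profile, activities[a], weights),
                 reverse=True)
```

Because the score depends only on the profile and the annotated attributes, a new learner or a new activity can be ranked immediately, which is why such techniques cope well with the cold-start problem; changing the weights changes the ranking without any behavior data.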

3.4 Association rule mining

Association rule mining techniques (Agrawal et al. 1994) are among the most popular ways of representing discovered knowledge; they describe close correlations between frequent items in a database. An association rule consists of an antecedent (left-hand side) and a consequent (right-hand side), whose intersection is empty. An association rule of the type

$$\begin{aligned} X\Rightarrow Y \end{aligned}$$

expresses a close correlation between items (attribute-value pairs) in a database (Zheng et al. 2001). Most association rule mining algorithms require the user to set at least two thresholds: minimum support and minimum confidence. The support of a rule is defined as the probability that an entry satisfies both X and Y. Confidence is defined as the probability that an entry satisfies Y given that it satisfies X. The aim is therefore to find all association rules that satisfy the minimum support and confidence restrictions specified by the user. Consequently, the user needs a certain amount of expertise to find the support and confidence settings that yield the best rules.
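Support and confidence can be estimated as relative frequencies over a set of transactions. In the sketch below, each transaction is the set of learning objects visited in one learner session; the session data and function names are invented for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(transactions, X, Y):
    """Estimate of P(Y | X): support(X and Y) divided by support(X)."""
    support_x = support(transactions, X)
    if support_x == 0:
        return 0.0
    return support(transactions, set(X) | set(Y)) / support_x


sessions = [
    {"intro", "loops", "quiz1"},
    {"intro", "loops"},
    {"intro", "quiz1"},
    {"loops", "quiz1"},
    {"intro", "loops", "quiz1"},
]
rule_support = support(sessions, {"intro", "loops"})
rule_confidence = confidence(sessions, {"intro"}, {"loops"})
```

For the rule intro ⇒ loops, three of the five sessions contain both objects and four contain intro, so the rule would pass a minimum support of 0.5 and a minimum confidence of 0.7.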

Association rule mining has been applied to e-learning systems to intelligently recommend on-line learning activities to learners based on the actions of previous learners, in order to improve course content navigation and to assist the on-line learning process (García et al. 2007).

By counting learners’ browsing records, learning paths and test grades, and by finding connections between learning objects, association rules can be used to derive the learning profiles of incoming learners and to perform the following tasks:

  • building recommender agents for on-line learning activities or shortcuts (Zaiane 2002),

  • automatically guiding the learner’s activities and intelligently recommending on-line learning activities or shortcuts in the course web site to the learners (Lu 2004),

  • identifying attributes of performance inconsistency between various groups of learners (Minaei-Bidgoli et al. 2004),

  • discovering interesting learner’s usage information in order to provide feedback to course author (Romero et al. 2004),

  • finding out the relation among the learning materials from a large amount of material data (Yu et al. 2001),

  • finding learners’ mistakes that often occur together (Merceron and Yacef 2004),

  • optimizing the content of an e-learning portal by determining the content of most interest to the learner (Ramli 2005),

  • deriving useful patterns to help educators and instructors evaluate and interpret on-line course activities (Zaiane 2002), and

  • personalizing e-learning based on comprehensive usage profiles and a domain ontology (Markellou et al. 2005).

Most of the subjective approaches involve learner participation in order to express, in accordance with his or her previous knowledge, which rules are of interest. Hence, subjective measures are becoming increasingly important (Silberschatz and Tuzhilin 1996). Some suggested subjective measures (Liu et al. 2000) are:

  • Unexpectedness: Rules are interesting if they are unknown to the learner or contradict the learner’s knowledge.

  • Actionability: Rules are interesting if learners can do something with them to their advantage.

Several studies address the application of association rule mining and recommender systems in e-learning. Association rules for classification applied to e-learning (Castro et al. 2007) have been investigated in the area of learning recommendation systems (Chu et al. 2003; Zaiane 2002), for example for learning material organization (Tsai et al. 2006), learner assessment (Hwang et al. 2003; Kumar 2005; Matsui and Okamoto 2003; Resende and Pires 2001), course adaptation to the learners’ behavior (Hsu et al. 2003; Markellou et al. 2005; Muñoz-Merino et al. 2015), and evaluation of educational web sites (Dos Santos and Becker 2003).

Wang (2002) developed a portfolio analysis tool based on associative material clusters and the sequences among them. This knowledge allows teachers to study the dynamic browsing structure and to identify interesting or unexpected learning patterns. Minaei-Bidgoli et al. (2004) propose mining interesting contrast rules for Web-based education systems. Contrast rules help to identify attributes characterizing patterns of performance difference between various groups of learners. Markellou et al. (2005) propose an ontology-based framework and discover association rules using the Apriori algorithm. The role of the ontology is to determine which learning materials are more suitable for recommending to the learner. Li and Zaïane (2004) and Dascalua et al. (2015) use recommender agents for recommending online learning activities or shortcuts in a course web site based on a learner’s access history. Romero et al. (2004) propose to use grammar-based genetic programming with multi-objective optimization techniques for discovering useful association rules from learners’ usage information. Merceron and Yacef (2004) use association rules and symbolic data analysis, as well as traditional SQL queries, to mine learner data captured from a web-based tutoring tool. Their goal is to find mistakes that often occur together. Freyberger et al. (2004) use association rules to determine what operation to perform on the transfer model that predicts a learner’s success.

The Apriori algorithm (Agrawal et al. 1993) is a prominent algorithm for mining frequent itemsets for Boolean association rules. Apriori is time-consuming because it scans the database many times. Therefore, many algorithms, such as the DIC algorithm (Brin et al. 1997), the DHP algorithm (Park et al. 1995) and the AprioriTid algorithm (Agrawal et al. 1993), have been proposed to improve performance.
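The level-wise search of Apriori, and the reason for its repeated database scans, can be seen in a compact sketch. This is a simplified textbook-style version written for readability, not the optimized algorithm of Agrawal et al.; the session data is invented.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frequent itemset: support}, found level by level."""
    n = len(transactions)

    def frequent(candidates):
        # One full pass over the database per level.
        result = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:
                result[c] = s
        return result

    level = frequent({frozenset([i]) for t in transactions for i in t})
    all_frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev
                             for s in combinations(c, k - 1))}
        level = frequent(candidates)
        all_frequent.update(level)
        k += 1
    return all_frequent


sessions = [
    {"intro", "loops", "quiz1"},
    {"intro", "loops"},
    {"intro", "quiz1"},
    {"loops", "quiz1"},
    {"intro", "loops", "quiz1"},
]
itemsets = apriori(sessions, min_support=0.6)
```

Each call to `frequent` scans all transactions once, so the number of passes grows with the size of the largest frequent itemset; this is the cost that the improved variants aim to reduce.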

Association rule mining and frequent pattern mining were applied in Zaïane (2001) to extract useful patterns that might help teachers, educational managers, and Web masters to evaluate and understand on-line course activities. A similar approach can be found in Minaei-Bidgoli et al. (2004), where contrast rules, defined as sets of conjunctive rules describing patterns of performance difference between groups of learners, were used. A computer-assisted approach for diagnosing learners’ learning problems in science courses and offering them advice was presented in Hwang et al. (2003), based on the concept effect relationship (CER) model, a specification of the association rules technique.

Costabile et al. (2005) described a hypermedia learning environment with a tutorial component. It is called Logiocando and targets children in the fourth year of primary school (9–10 years old). It includes a tutor module, based on if-then rules, that emulates the teacher by providing suggestions on how and what to study. Matsui and Okamoto (2003) describe a learning process assessment method that resorts to association rules and the well-known ID3 decision tree (DT) learning method. A framework for the use of Web usage mining to support the validation of learning site designs was defined in Dos Santos and Becker (2003), applying association and sequence techniques (Srivastava et al. 2000).

Markellou et al. (2005) presented a framework for personalized e-learning based on aggregate usage profiles, a domain ontology, and a combination of Semantic Web and Web mining methods. The Apriori algorithm for association rules was applied to capture relations among URL references based on the navigational patterns of learners. A test result feedback (TRF) model that analyzes the relationships between learners’ learning time and the corresponding test results was introduced in Hsu et al. (2003). The objective was twofold: on the one hand, to develop a tool that supports the tutor in reorganizing the course material; on the other, to personalize the course according to individual learner needs. The approach was based on association rule mining. A rule-based mechanism for the adaptive generation of problems in Intelligent Tutoring Systems (ITS) in the context of Web-based programming tutors was proposed in Kumar (2005). In Hwang et al. (2003), a Web-based course recommendation system, which provides learners with suggestions when they have trouble choosing courses, was described. The approach integrates the Apriori algorithm with graph theory.

Some of the main drawbacks of association rule algorithms are (García et al. 2007):

  • association rule mining algorithms normally discover a huge quantity of rules and do not guarantee that all the rules found are relevant,

  • the algorithms used have too many parameters for users who are not experts in data mining, and

  • the rules obtained are far too numerous, most of them are uninteresting, and their comprehensibility is low.

In order to provide better recommendations, and to be able to use recommender systems in more complex types of e-learning, most of the methods reviewed in this subsection would need significant extensions.

In the next section, we analyze a new approach that improves the understanding of learners, incorporating the tag information into the recommendation process. We first describe Collaborative Tagging Systems and Folksonomy in general; then we emphasize the proposed features of collaborative tagging that are attributed to their success in e-learning.

4 A survey of collaborative tagging systems and folksonomy

Over the past several years, much research has been done on recommendation technologies that use a variety of statistical, machine learning, information retrieval, and other techniques, which have significantly advanced early recommender systems based on collaborative and content-based heuristics. The recommendation techniques explained in the previous section have performed well in several applications, including those for recommending books, CDs, news articles or movies, and some of these methods are used in “industrial-strength” recommender systems, such as the ones developed at Amazon, MovieLens, and Last.fm. However, all of these methods have certain limitations. Recommender systems can be extended in several ways, which include improving the understanding of users and items, incorporating contextual information into the recommendation process, supporting multicriteria ratings, and providing more flexible and less intrusive types of recommendations (Adomavicius and Tuzhilin 2005).

Collaborative tagging is employed as an approach for the automatic analysis of user preferences and for recommendation. To improve recommendation quality, metadata such as content information about items has typically been used as additional knowledge. With the increasing popularity of collaborative tagging systems, tags could be interesting and useful information for enhancing recommender system algorithms. Collaborative tagging systems allow users to upload their resources and to label them freely with arbitrary keywords, so-called tags (Golder and Huberman 2005). Collaborative tagging is most useful when there is nobody in the “librarian” role or when there is simply too much content for a single authority to classify. People tag pictures, videos, and other resources with a couple of keywords so that they can retrieve them easily at a later stage. The following features of collaborative tagging are credited with its success and popularity (Mathes 2004; Quintarelli 2005; Wu et al. 2006).

  • Low cognitive cost and entry barriers: The simplicity of tagging allows any Web user to classify their favorite Web resources by using keywords that are not constrained by predefined vocabularies.

  • Immediate feedback and communication: Tag suggestions in collaborative tagging systems provide a mechanism for users to communicate implicitly with each other when describing resources on the Web.

  • Quick adaptation to changes in vocabulary: The freedom provided by tagging allows a fast response to changes in the use of language and to the emergence of new words. Terms like Web2.0, ontologies and social network can be used readily by the users without the need to modify any pre-defined schemes.

  • Individual needs and formation of organization: Tagging systems provide a convenient means for Web users to organize their favorite Web resources. Besides, as the systems develop, users are able to discover other people who are also interested in similar items.

  • Scalability: Predefined vocabularies become imprecise when a domain grows. Tags, in contrast, can reach a nearly unlimited granularity.

  • Serendipity: Controlled vocabularies are designed to ease retrieval, but less popular content that resides in the so-called long tail of the information space is hard to find. Tags enable users to discover long-tail information by browsing through the folksonomy network of items, tags, and users.

  • Inclusiveness: The set of potential tags includes every user’s views, preferences, or language as well as all potential topics.

Since individual users create tags in free form, an important problem in tagging is to identify the most appropriate tags while eliminating noise and spam. For this purpose, Noll et al. (2009) define a set of general criteria for a good tagging system.

  • High coverage of multiple facets: A good tag combination should cover multiple facets of the tagged object. The larger the number of facets, the more likely a user is to recall the tagged content.

  • High popularity: If a set of tags is used by a large number of people for a particular object, these tags are more likely to identify the tagged content uniquely, and a new user is more likely to use them for that object.

  • Least effort: The number of tags needed to identify an object should be minimal, and the number of objects identified by the tag combination should be small. As a result, a user can reach any tagged object in a small number of steps via tag browsing.

  • Uniformity (normalization): Since there is no universal ontology, different people can use different terms for the same concept. In general, two types of divergence can be observed: those due to syntactic variance, e.g., color, colorize, colorise, colourise; and those due to synonymy, e.g., learner and pupil, which are different terms that refer to the same underlying concept. These kinds of divergence are a double-edged sword: on the one hand they introduce noise into the system; on the other hand they can increase recall.

  • Exclusion of certain types of tags: For example, personally used organizational tags are less likely to be shared by different users and should therefore be excluded from public usage. Rather than ignoring these tags, a tagging system can include a feature that auto-completes tags as they are typed by matching the prefixes of tags the user has entered before. This not only improves the usability of the system but also promotes the convergence of tags.
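
The prefix-based auto-completion mentioned in the last criterion can be sketched as follows. This is only a minimal illustration, not the mechanism of any particular tagging system: the function name `complete_tag`, the case-insensitive matching, and the `limit` parameter are assumptions introduced here.

```python
import bisect

def complete_tag(prefix, known_tags, limit=5):
    """Return up to `limit` previously used tags that start with `prefix`.

    Assumption: matching is case-insensitive and tags are stored lower-cased.
    """
    vocabulary = sorted({tag.lower() for tag in known_tags})
    prefix = prefix.lower()
    # Binary search for the first tag that is >= the prefix.
    start = bisect.bisect_left(vocabulary, prefix)
    matches = []
    for tag in vocabulary[start:]:
        if not tag.startswith(prefix):
            break  # sorted order: no later tag can match either
        matches.append(tag)
        if len(matches) == limit:
            break
    return matches

# Hypothetical tags a user has entered before.
earlier_tags = ["Colour", "color", "colorize", "python", "pupil"]
print(complete_tag("col", earlier_tags))
```

Offering the user existing spellings in this way is one route by which syntactic variants could converge, although it does nothing about synonyms such as learner and pupil.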

The term folksonomy denotes a user-generated and distributed classification system that emerges when large communities of users collectively tag resources (Wal 2005). Hotho et al. (2006a) define a folksonomy as follows.

Fig. 4 Conceptual model of a collaborative tagging system (Marlow et al. 2006)

A folksonomy is a quadruple \(F:=(U,T,I,Y)\), where U, T, and I are finite sets of users, tags, and items, and Y defines a relation, the tag assignment, between these sets, that is, \(Y\subseteq U\times T\times I\). Folksonomies became popular on the Web with social software applications such as social bookmarking, photo sharing, and weblogs, and social tagging sites such as Delicious, Flickr, YouTube, and CiteULike have become widely used. Commonly cited advantages of folksonomies are their flexibility, rapid adaptability, free-for-all collaborative customization, and serendipity (Mathes 2004). People can in general use any term as a tag without exactly understanding the meaning of the terms they choose. The power of folksonomies lies in the aggregation of tagged information that one is interested in. This improves social serendipity by enabling social connections and by providing social search and navigation (Quintarelli 2005). Folksonomies offer many benefits (Peters and Stock 2007). They:

  • represent an authentic use of language,

  • allow multiple interpretations,

  • are cheap methods of indexing,

  • are the only way to index mass information on the Web,

  • are sources for the development of ontologies, thesauri or classification systems,

  • give the quality “control” to the masses,

  • allow searching and—perhaps even better—browsing,

  • recognize neologisms,

  • can help to identify communities,

  • are sources for collaborative recommender systems,

  • make people sensitive to information indexing.

There are two types of folksonomies: broad and narrow (Wal 2005). In a broad folksonomy, such as Delicious, many people tag the same object, and every person can tag the object with their own tags in their own vocabulary. Thus, in theory, a great number of tags can refer to the same object (item), because users might independently use very distinct tags for the same content. A narrow folksonomy, as represented by a tool like Flickr, is beneficial for tagging objects that are not easily searchable or that have no other text by which they can be described or found.

4.1 A model for tagging activities

Social tagging systems allow their users to share their tags of particular resources. Each tag serves as a link to additional resources tagged in the same way by other users (Marlow et al. 2006). Certain resources may link to each other; at the same time, there may be relationships between users according to their own social interests, so the shared tags of a folksonomy come to interconnect the three groups of protagonists in social labeling systems: Users, Items, and Tags.

Many researchers (Mika 2005; Halpin et al. 2007; Ciro et al. 2007) suggested a tripartite model that represents the tagging process:

$$\begin{aligned} Tagging\,{:}\,(U,T,I) \end{aligned}$$

where U is the set of users who participate in a tagging activity, T is the set of available tags, and I is the set of items being tagged. Figure 4 shows a conceptual model of a social tagging system in which users and items are connected through the tags they assign. In this model, users assign tags to a specific item; tags are represented as typed edges connecting users and items. Items may be connected to each other (e.g., as links between web pages), and users may be associated through a social network or sets of affiliations (e.g., users who work for the same company).
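
To make the tripartite model concrete, the sketch below stores tag assignments as (user, tag, item) triples and indexes them by tag, so that a tag can act as a link to other items and users. It is a minimal illustration: the class and method names and the sample data are hypothetical, and for simplicity the sets U, T, and I are derived from the assignments rather than given independently.

```python
from collections import defaultdict
from typing import NamedTuple

class TagAssignment(NamedTuple):
    user: str
    tag: str
    item: str

class Folksonomy:
    """Users U, tags T, items I and the tag-assignment relation Y."""

    def __init__(self, assignments):
        self.Y = set(assignments)
        # Assumption: U, T and I contain only elements that occur in Y.
        self.U = {a.user for a in self.Y}
        self.T = {a.tag for a in self.Y}
        self.I = {a.item for a in self.Y}
        self._items_by_tag = defaultdict(set)
        self._users_by_tag = defaultdict(set)
        for a in self.Y:
            self._items_by_tag[a.tag].add(a.item)
            self._users_by_tag[a.tag].add(a.user)

    def items_tagged(self, tag):
        """Items that any user has labelled with `tag`."""
        return set(self._items_by_tag.get(tag, ()))

    def users_of(self, tag):
        """Users who have applied `tag` to at least one item."""
        return set(self._users_by_tag.get(tag, ()))

# Hypothetical learners and learning objects.
folksonomy = Folksonomy([
    TagAssignment("ana", "recursion", "lesson-3"),
    TagAssignment("ben", "recursion", "quiz-7"),
    TagAssignment("ben", "loops", "lesson-2"),
])
print(sorted(folksonomy.items_tagged("recursion")))
```

Following the tag "recursion" from one learner's item leads to the item tagged by the other learner, which is the linking behavior the model describes.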

An examination of collaborative tagging systems such as Delicious (Golder and Huberman 2005) has revealed a rich variety in the ways in which tags are used, regularities in user activity and tag frequencies, and great popularity in bookmarking, as well as a significant stability in the relative proportions of tags within a given URL. Tags are used in the following ways:

  a. Tags may be used to identify the topic of a resource using nouns and proper nouns (e.g. photo, album, and photographer).

  b. Tags may be used to classify the type of resource (e.g. book, blog, article, review, and event).

  c. Tags may be used to denote the qualities and characteristics of the item (e.g. funny, useful, and cool).

  d. A subset of tags, such as myfavourites, mymusic, and myphotos, reflects a notion of self-reference.

  e. Some tags are used by individuals for task organization (e.g. to read, job search, and to print).

Time is an important factor when considering collaborative tagging systems. In fact, definitions of tags and relationships among them can vary over time. For some users the number of tags becomes stable over time, while for others it keeps growing. There are three hypotheses about tag behavior over time (Halpin et al. 2006):

  a. Tag convergence: the tags assigned to a certain Web resource tend to stabilize, and a stable set of tags comes to account for the majority of assignments.

  b. Tag divergence: the tag set does not converge to a smaller group of more stable tags, and the tag distribution changes repeatedly.

  c. Tag periodicity: after one group of users settles on a locally optimal tag set, another group uses a divergent set, and after a period of time the new group’s set becomes the new locally optimal tag set. This process may repeat and so lead to convergence after a period of instability, or it may act like a chaotic attractor.
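
One simple way to examine these hypotheses for a single resource is to compare the tag proportions at successive points in time with the final proportions. The sketch below does this with the total variation distance. That measure, the function names, and the sample stream are illustrative assumptions made here and are not taken from the cited studies.

```python
from collections import Counter

def tag_proportions(tags):
    """Relative frequency of each tag in a list of tag assignments."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: count / total for tag, count in counts.items()}

def total_variation(p, q):
    """Half the L1 distance between two tag distributions (0 = identical)."""
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in set(p) | set(q))

def drift(tag_stream, window):
    """Distance to the final proportions after every `window` assignments."""
    final = tag_proportions(tag_stream)
    return [
        total_variation(tag_proportions(tag_stream[:end]), final)
        for end in range(window, len(tag_stream) + 1, window)
    ]

# Hypothetical chronological tag assignments for one resource.
stream = ["tutorial", "tutorial", "python", "python",
          "python", "python", "python", "python"]
print(drift(stream, window=4))
```

Under this reading, distances that shrink towards zero would be consistent with convergence, persistently large distances with divergence, and distances that fall and rise again with periodicity.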

4.2 Tag-based recommender systems

Recommender systems in general recommend interesting or personalized information objects to users based on explicit or implicit ratings. Usually, recommender systems predict ratings of objects or suggest a list of new objects that the user is most likely to like. Profiling users with a user-item rating matrix or with keyword vectors is widely used in recommender systems. However, these approaches describe only two-dimensional relationships between users and items. In tag recommender systems the recommendations are, for a given user \(u\in U\) and a given item \(i\in I\), a set \(\hat{{T}}(u,i)\subseteq T\) of tags. In many cases, \(\hat{{T}}(u,i)\) is computed by first generating a ranking on the set of tags according to some quality or relevance criterion, from which the top n elements are then selected (Janssen et al. 2007).
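
The ranking-then-top-n scheme can be sketched as follows. The relevance criterion used here, a mix of how often other users applied a tag to the resource and how often the target user applied it elsewhere, is only one plausible choice and is an assumption of this example, as are the function name, the `user_weight` parameter, and the sample data.

```python
from collections import Counter

def recommend_tags(user, item, assignments, n=3, user_weight=0.5):
    """Rank candidate tags for (user, item) and return the top n.

    `assignments` is an iterable of (user, tag, item) triples.
    """
    assignments = list(assignments)
    # Popularity of each tag on this item among other users.
    item_counts = Counter(t for u, t, i in assignments if i == item and u != user)
    # Tags the user has applied to other items (personal vocabulary).
    user_counts = Counter(t for u, t, i in assignments if u == user and i != item)
    # Tags the user already gave this item are not recommended again.
    already = {t for u, t, i in assignments if u == user and i == item}

    scores = Counter()
    for tag, count in item_counts.items():
        scores[tag] += count
    for tag, count in user_counts.items():
        scores[tag] += user_weight * count

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted((t for t in scores if t not in already),
                    key=lambda t: (-scores[t], t))
    return ranked[:n]

# Hypothetical tag assignments.
assignments = [
    ("ana", "recursion", "lesson-3"),
    ("ben", "recursion", "lesson-3"),
    ("ben", "functions", "lesson-3"),
    ("cy", "python", "lesson-1"),
    ("cy", "python", "lesson-2"),
    ("cy", "basics", "lesson-2"),
]
print(recommend_tags("cy", "lesson-3", assignments, n=2))
```

Any other scoring function could be substituted without changing the final step, which simply keeps the n best-ranked tags.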

Personalized recommendation is used to overcome the information overload problem, and collaborative filtering is one of the most successful recommendation techniques to date. However, collaborative filtering becomes less effective when users have multiple interests, because users who have similar taste in one aspect may behave quite differently in other aspects. Information obtained from social tagging websites tells us not only what a user likes, but also why he or she likes it (Zhou et al. 2012). Tagging represents an act of reflection, in which the tagger sums up a series of thoughts in one or more summary tags, each of which stands on its own to describe some aspect of the resource based on the tagger’s experiences and beliefs (Klašnja-Milićević et al. 2010). In the remainder of this section, we describe how recommender systems can be extended with tag information to improve recommendation quality.

4.3 Extension with tags

Current recommender systems commonly use collaborative filtering techniques, which traditionally exploit only two-dimensional data. As collaborative tagging becomes more widely used, social tags, a powerful mechanism that reveals three-dimensional correlations between users, tags, and items, could be employed as background knowledge in recommender systems.

The first adaptation lies in reducing the three-dimensional folksonomy to three two-dimensional contexts: \(<user, tag>\), \(<item, tag>\), and \(<user, item>\). This can be done by augmenting the standard user-item matrix horizontally and vertically with user tags and item tags, respectively (Marinho et al. 2012). User tags are the tags that a user u uses to tag items; they are viewed as items in the user-item matrix. Item tags are the tags that users assign to an item i; they play the role of users in the user-item matrix (see Fig. 5). Furthermore, instead of treating each single tag as a user or an item, clustering methods can be applied to the tags so that similar tags are grouped together.
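
The horizontal and vertical extension of the user-item matrix can be illustrated with a small sketch. It assumes binary entries (1 if a user has tagged an item, used a tag, or an item carries a tag) and plain nested lists. The function name and the sample data are hypothetical, and the cited works may weight the entries differently.

```python
def extend_user_item_matrix(assignments):
    """Build the two tag-extended views of a binary user-item matrix.

    `assignments` is an iterable of (user, tag, item) triples.
    Returns (users, items, tags, user_view, item_view) where
      user_view has one row per user and columns items + tags,
      item_view has rows users + tags and one column per item.
    """
    assignments = list(assignments)
    users = sorted({u for u, _, _ in assignments})
    tags = sorted({t for _, t, _ in assignments})
    items = sorted({i for _, _, i in assignments})

    # The three two-dimensional projections of the folksonomy.
    user_item = {(u, i) for u, _, i in assignments}
    user_tag = {(u, t) for u, t, _ in assignments}
    item_tag = {(i, t) for _, t, i in assignments}

    base = [[int((u, i) in user_item) for i in items] for u in users]
    # Horizontal extension: user tags appended as extra "items" (columns).
    user_view = [row + [int((u, t) in user_tag) for t in tags]
                 for u, row in zip(users, base)]
    # Vertical extension: item tags appended as extra "users" (rows).
    item_view = base + [[int((i, t) in item_tag) for i in items] for t in tags]
    return users, items, tags, user_view, item_view

# Hypothetical tag assignments.
data = [
    ("ana", "recursion", "lesson-3"),
    ("ben", "recursion", "quiz-7"),
    ("ben", "loops", "lesson-2"),
]
users, items, tags, user_view, item_view = extend_user_item_matrix(data)
for row in user_view:
    print(row)
```

In this sketch the two learners share no item, yet their rows in `user_view` overlap on the column for the tag they both used, which is the extra signal the extension is meant to expose to a standard collaborative filtering algorithm.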

Fig. 5 Extended user-item matrices that include user tags as items and item tags as users (Tso-Sutter et al. 2008)

A tag-based recommender system must meet several challenges to be successful in a real-world application (Marinho et al. 2011; Jäschke et al. 2012; Lacic et al. 2014):

  • tags should describe the annotated item,

  • items should arouse the interest of the user,

  • suggested items should be interesting and relevant,

  • the suggestions should be traceable, so that users can easily understand why the items were suggested to them,

  • the suggestions must be delivered without delay,

  • the suggestions must be easy to access (e.g., by allowing the user to click on them or to use tab-completion when entering tags),

  • the system must ensure that recommendations do not obstruct the normal usage of the system.

Recommending tags can serve various purposes, such as: increasing the chances of getting an item annotated, reminding a user what an item is about and consolidating the vocabulary across the users.

5 Applying tag-based recommender systems to e-learning environments

In this section, we investigate the suitability of tag-based recommender systems in a new context: e-learning. The innovation with respect to e-learning systems lies in their ability to support learners along their own learning paths by recommending tags and learning items, and in their ability to improve the learning performance of individual learners.

Using tags enables useful item organization and browsing techniques, such as “pivot browsing” (Millen et al. 2006), which provides a simple and effective method for discovering new and relevant items. Learners can benefit from writing tags in two important ways. First, tagging is regarded as a meta-cognitive strategy that involves learners in active learning and engages them more effectively in the learning process. As summarized by Bonifazi et al. (2002), tags can help learners to remember better by highlighting the most significant part of a text, can encourage learners to think as they add their own ideas to what they are reading, and can help learners to clarify and make sense of the learning content while they try to reshape the information. Second, learners’ tags can create an important trail for other learners to follow by recording their thoughts about a specific tutorial resource. In addition, learners’ tags can give comprehensible recommendations about the resources. However, while viewing the tags used on a web page can give a learner some idea of its importance and content, it falls short of helping a learner find the exact point of interest within the page. The following features of collaborative tagging are generally credited for its success in e-learning (Bateman et al. 2007; Doush 2011; Dahl and Vossen 2008):

  1. The information provided by tags offers insight into learners’ comprehension and activity, which is useful for both educators and administrators.

  2. Collaborative tagging has the potential to further enhance peer interaction and peer awareness centered on learning content.

  3. Tagging, by its very nature, is a reflective practice, which can give learners an opportunity to summarize new ideas while receiving peer support through viewing other learners’ tags and tag suggestions.

  4. In e-learning there is a lack of the social cues that inform instructors about their learners’ understanding of new concepts. Collaborative tags, created by learners to categorize learning content, would allow instructors to reflect at different levels on their learners’ progress. Tags could be examined at the individual level to assess the understanding of a learner (e.g. tags that are out of context could indicate a misconception), while tags examined at the group level could reveal the overall progress of the class. Working with instructors of online courses that employ tagging would help shed light on the perceived benefits of reflection based on tags.

  5. Tagging provides possible solutions for engaging learners in a number of different annotation activities, such as adding comments, corrections, links, or shared discussion. E-learning systems currently lack sufficient support for self-organization and annotation of learning content (Bateman et al. 2007). However, a walk through a university campus shows learners engaged in a number of annotation activities. These include writing notes, creating marginalia in books, highlighting text, dog-earing pages, and bookmarking pages. During lectures as many as 99% of learners take notes (Palmatier and Bennet 1974), and 94% of learners at the post-secondary level believe that note taking is an important educational activity (Wiley 2000). In this sense, tagging is beneficial to note taking, since tags represent an aspect or cue to be used in the tagger’s recall process.

Traditionally, e-learning systems intend to provide direct customized instruction to learners by finding the mismatches between the knowledge of the expert and the actions that reflect the assimilation of that knowledge by the learner (Santos and Boticario 2008). Their main limitations are:

  1. they are specific to the domain for which they have been designed (since they have to be provided with the expert knowledge), and

  2. it is unrealistic to think that all the possible responses needed to cover the specific needs of each learner in any situation of the course can be coded into a system.

In this sense, dynamic support that recommends to learners what to do in order to achieve their learning goals is desirable. In addition, such systems should be able to find appropriate content on the Web and to personalize and adjust this content based on the system’s examination of its learners and on the collected tags given by learners and domain experts.

5.1 Limitations of current folksonomy and possible solutions

Tagging systems have the potential to improve search, recommendation, and personal organization while introducing new modalities of social communication. As described above, much research has been done on tag-based recommendation technologies, which have significantly advanced the state of the art in comparison with early recommender systems that used collaborative and content-based heuristics. Despite the rapid expansion of applications that support tagging of items, the simplicity and ease of use of tagging lead to problems with current folksonomy systems, which hinder their growth or reduce their usefulness. These problems can be classified into several categories (Mathes 2004; Shepitsen et al. 2008; Pluzhenskaia 2006; Gordon-Murnane 2006). We consider a set of limitations that can directly affect the tag-based recommendation process in e-learning environments.

  1. Tags have little semantics and many variations. Thus, even if a tagging activity can be considered as the learner’s cognitive process, the resulting set of tags does not always correctly and consistently represent the learner’s mental model.

  2. As an uncontrolled vocabulary that is shared across an entire system, the terms in a folksonomy have inherent ambiguity, as different learners apply terms to items in different ways. Tag ambiguity, in which a single tag has many meanings, can falsely give the impression that items are similar when they are in fact unrelated.

  3. Tag redundancy, in which several tags have the same meaning, can obscure the similarity among items. Redundant tags can hinder algorithms that depend on identifying similarities between items.

  4. The use of different word forms, such as plurals and different parts of speech, exacerbates the problem.
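
As a small illustration of how the syntactic part of the redundancy problem might be reduced, the sketch below maps tags to a crude normalized key and groups the variants that share it. The rules used (lower-casing, dropping non-alphanumeric ASCII characters, stripping a single trailing "s") are deliberately naive assumptions of this example, and the function names are hypothetical.

```python
import re
from collections import defaultdict

def normalize_tag(tag):
    """Map a tag to a crude canonical key (naive rules, ASCII only)."""
    key = re.sub(r"[^a-z0-9]", "", tag.lower())
    # Very rough plural handling: drop one trailing "s", but not "ss".
    if len(key) > 3 and key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key

def group_variants(tags):
    """Group the surface forms that share the same normalized key."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalize_tag(tag)].add(tag)
    return dict(groups)

# Hypothetical tags entered by different learners.
groups = group_variants(
    ["Web2.0", "web 2.0", "Tags", "tag", "class", "e-learning", "elearning"]
)
for key, variants in sorted(groups.items()):
    print(key, sorted(variants))
```

Rules of this kind cannot merge irregular forms (ontology and ontologies), synonyms (learner and pupil), or ambiguous tags. Those cases would need a proper stemmer or semantic resources, which are outside the scope of this sketch.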

There are several approaches that aim to solve these problems. The first tries to educate learners in order to improve “tag literacy” (Guy and Tonkin 2006). An important precondition for this approach is user research on folksonomies (Bar-Ilan et al. 2008; Winget 2006; Lin et al. 2006), concerning the “deep nature” of tags (Veres 2006a), aspects of folksonomy interoperability (Veres 2006b), and the “semiotic dynamics” of folksonomies in terms of tag co-occurrences (Cattuto et al. 2007). To train learners in selecting “good” tags, it may be useful for the system to suggest some tags (MacLaurin 2007). Tag suggestions can operate on a syntactic level (e.g., a learner attaches “graph” and the system suggests “graphics”) or on a relational level (e.g., a learner attaches “graphics” and the system suggests “image”, because the two words often co-occur in items’ tag clouds) (Xu et al. 2006). In addition, tag suggestions can be based on experts’ opinions, which yields high-quality tags that are objective and cover multiple aspects.
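
Relational tag suggestion based on co-occurrence can be sketched in a few lines. The example counts how often two tags appear together on the same item and suggests the most frequent companions of a given tag. The function names and the sample data are hypothetical, and raw co-occurrence counts are only one of several possible relatedness measures.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_cooccurrence(item_tags):
    """Count, for each tag, how often every other tag shares an item with it."""
    co = defaultdict(Counter)
    for tags in item_tags.values():
        for a, b in combinations(sorted(set(tags)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co

def suggest_related(tag, co, n=3):
    """Return up to n tags that co-occur most often with `tag`."""
    ranked = sorted(co.get(tag, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:n]]

# Hypothetical tag clouds of three learning items.
item_tags = {
    "lesson-1": ["graphics", "image", "canvas"],
    "lesson-2": ["graphics", "image"],
    "lesson-3": ["graphics", "shader"],
}
co = build_cooccurrence(item_tags)
print(suggest_related("graphics", co, n=2))
```

A learner who attaches "graphics" would here be offered "image" first, since the two tags share the most items.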

These extensions leave many opportunities for future work in this area. They can improve tag-based recommendation capabilities and make collaborative tagging systems applicable to an even broader range of applications.

6 Conclusions

With the rapid development of e-learning environments, which are characterized by huge amounts of information, strong interactivity, great coverage, and no space-time limitations, personalization is becoming an important feature of e-learning systems. Personalization can be achieved by using heuristic rules, user models, and recommendation techniques.

Recommender systems have attracted the attention of academics and practitioners. In this survey, we identified 160 research papers on recommender systems published between 2001 and 2015, in order to understand the trends in e-learning-related recommender systems research and to provide practitioners and researchers with insight and future directions on RSs in e-learning environments. The results presented in this paper have several significant implications:

  • This paper contributes to the conceptual and theoretical understanding of RS in e-learning environments.

  • It highlights the most important requirements and challenges for designing a recommender system in e-learning environments, which so far have not been presented in the literature in a systematic and unified way.

  • There is limited research on collaborative tagging in education, despite growing interest in exploring and unlocking the value of the increasing amount of metadata within education environments. The paper describes the opportunities this research area brings to higher education, which are twofold. On the one hand, tagging provides possible solutions for engaging learners in a number of different annotation activities, such as adding comments, corrections, links, or shared discussion, and can increase the amount of lessons learned. On the other hand, collaborative tags, created by learners to categorize learning content, would allow instructors to reflect at different levels on their learners’ progress.

  • The paper also considers various limitations of the current generation of folksonomy systems that can directly affect the tag-based recommendation process in e-learning environments, together with possible extensions that can provide better recommendation capabilities.

  • The paper shows that the majority of recommender systems research has been published in Management Information Systems venues, such as ACM and IEEE publications, as well as in Computers in Human Behavior and Behavior and Information Technology. In addition to the computer and information technology fields, recommender systems research has included various business fields, so more recommender systems research can be expected to appear in management and business journals.

In this paper, we have presented a survey of the state of the art in traditional recommendation techniques that can be suitable for the e-learning process. Various limitations of these techniques are also considered. The integration of a tag-based recommendation approach into e-learning systems is an important ingredient of this paper. We analyzed the potential of collaborative tagging systems, including the features that are credited for their success in e-learning environments.

Our further research will focus on new directions, namely the evolving area of collaborative tagging systems. We have conducted preliminary evaluations of suitable recommendation techniques based on learners’ collaborative tags, obtaining favorable results (Klašnja-Milićević 2013). However, more exhaustive experiments should be carried out in order to reach well-founded conclusions about the benefits of our proposals. We also need more research on the basic psychological mechanisms that are involved when learners use a recommender system. For instance, studying the effectiveness of preference-inconsistent recommendations can be considered a step towards recognizing the psychological dynamics that specific types of recommendations create. We hope that the issues presented in this paper will advance the discussion about the next generation of technologies for improving the recommendation process in e-learning systems.