
1 Introduction

With the widespread application of educational big data, a variety of online tutorial approaches, such as MOOCs, have been proposed to help users acquire knowledge and skills. Most existing learning systems support resource sharing, which helps users study by resource category. However, users are sometimes unclear about their learning goals, which leads to inefficient guidance. Thus, personalized recommendations [1, 2] are required to capture users’ expectations.

Personal preferences and subject areas are two key factors in tutorial recommendation. Many studies [2,3,4] address personalized tutorial recommendation. For example, strategies for teaching recommendation have been discussed based on users’ cognitive characteristics, cognitive style, learning motivation, personality structure, and personality type theories. Sarwar et al. [1] presented a method combining users’ learning preferences with subject-area awareness. By establishing a prediction model, LCARS [5] analyzes the relationship between personal preferences and hot topics. Unfortunately, it remains difficult to make recommendations when users face unfamiliar subject areas, which may even confuse them. Hence, we focus on subject-area-aware recommendation across different subject areas.

Note that inferring items from unfamiliar subject areas using only a user’s historical learning data is challenging. Collaborative filtering (CF) makes recommendations by tracing users’ common interests. However, users usually access only a limited number of subject areas, which leads to data sparseness in the user learning-preference matrix and even to the cold-start problem for CF. In this case, purely CF-based methods [6] are not feasible, especially for unfamiliar subject areas, because query users often lack sufficient activity history there. To solve this problem, we propose a latent probabilistic generative model, PS-LDA, which consists of offline modeling and online recommendation. The offline model jointly captures the following two factors. On the one hand, each user has personal learning preferences that can be obtained by tracing his or her historical learning data. On the other hand, popular learning courses in the various subject areas also attract a user’s interest: when users access a new, and especially an unfamiliar, subject area, they are more likely to be interested in popular courses. Specifically, our model employs P-LDA to learn a user’s preferences from the user’s historical learning data, while S-LDA exploits subject-area-aware information to capture the popular courses of each subject area. Then, given a query user u who visits subject area \( s_{u} \), the online recommendation component calculates the ranking score of each course item v within \( s_{u} \) by automatically combining u’s learning preferences and \( s_{u} \)’s popular courses. Thus, our model supports personalized tutorial recommendation both in a user’s own subject areas and in unfamiliar ones.

The main contributions of our research are summarized as follows:

  1. We propose a latent probabilistic generative model, PS-LDA. Specifically, P-LDA performs user topic modeling to obtain user learning preferences, and S-LDA performs subject-area topic modeling to obtain the popular courses of each subject area. We also investigate the inference problem of our model.

  2. We present a top-k method for personalized recommendation that matches learning preferences with subject areas based on the results of P-LDA and S-LDA.

  3. We conducted experiments on two real-life datasets to evaluate the performance of our recommendation model. The results verify the effectiveness and efficiency of our model both in a user’s own subject areas and in unfamiliar subject areas.

The rest of the paper is organized as follows: Sect. 2 reviews the related work. Section 3 details the model PS-LDA on learning preferences and subject area perception. Section 4 introduces the top-k method for online recommendation. The experimental results are reported in Sect. 5. The paper is summarized in Sect. 6.

2 Related Work

Recommender System.

Collaborative filtering and content-based recommendation are two widely applied techniques in recommender systems. Both find relevant items according to a user’s personal interests. Collaborative filtering [1, 6] automatically recommends related items to a user by referencing item-rating information from similar users. Content-based recommendation [7] assumes that the descriptive characteristics of an item well reflect the user’s preference for it. Nevertheless, data sparseness affects CF, even causing cold start, and it also limits content-based recommendation. Therefore, a great deal of research [8] has explored combining the advantages of both approaches. Our recommendation additionally incorporates the popular courses of each subject area.

Personalized Generation Model.

Many models [9] have been presented for obtaining and analyzing users’ preferences. Yu et al. [11] used content sentiment analysis to improve the performance of a CF-based recommendation algorithm. Based on LOM (Learning Object Metadata), Mei et al. [10] modeled user interests and educational resources for online course recommendation. Apaza et al. [9] used the LDA (Latent Dirichlet Allocation) model to extract the features of online courses. Chen et al. [6] used cluster analysis and multiple linear regression to recommend courses of interest to students from behavioral information such as attendance. However, studies on the interaction between personal preferences and unfamiliar subject areas are lacking.

Our recommendation model differs from the above work in three aspects: 1) we abstract preferences from a user’s historical learning records to match unfamiliar subject areas; 2) we analyze popular courses to obtain hot topics; and 3) we propose a course-item model that mixes personal preferences with subject-area awareness.

3 Personalized Generation Model

In this section, we first introduce the key data structures and symbols used in this paper. Then we propose PS-LDA on learning preferences and subject area awareness for personalized recommendations.

3.1 Problem Definition

To facilitate the following presentation, we define the key data structures and symbols used in this paper. Table 1 lists the relevant symbols.

Table 1. Definition of symbols.

Definition 1

Course Item. A course item v refers to a specific course in an accessed subject area.

Definition 2

User Learning. A user learning is a triple (u, v, \( s_{v} \)), indicating that user u selects course item v in subject area \( s_{v} \).

Definition 3

User Learning Record. For each user u in dataset D, we create a user learning record \( D_{u} \), a set of quadruples associated with u. We denote a record of user, course item, subject area and label as (u, v, \( s_{v} \), \( c_{v} \)) ∈ D, where u ∈ U, v ∈ V, \( s_{v} \) ∈ S, and \( c_{v} \) ∈ \( C_{v} \). \( C_{v} \) denotes the set of labels associated with course item v; note that a course item may carry multiple labels. For the activity in which user u selects course item v in \( s_{v} \), we have the set of quadruples \( D_{uv} \) = {(u, v, \( s_{v} \), \( c_{v} \)): \( c_{v} \) ∈ \( C_{v} \)}. Obviously, \( D_{uv} \subseteq D_{u} \).

Definition 4

Topic. A topic z over the course item set V is represented by a topic model \( \phi_{z} \), \( \left\{ {P\left( {v\left| {\phi_{z} } \right.} \right):v \in \text{V}} \right\} \) or \( \{ \phi_{zv} :v \in \text{V}\} \), which is a probability distribution over course items. Analogously, a learning-preference topic over the user set U is represented through the labels \( c_{v} \) in users’ historical learning records by a topic model \( \phi^{\prime}_{z} \), \( \left\{ {P\left( {c\left| {\phi^{\prime}_{z} } \right.} \right):c \in C} \right\} \) or \( \{ \phi^{\prime}_{zc} :c \in \text{C}\} \), which is a probability distribution over labels reflecting users’ learning preferences. In summary, each topic z corresponds to two topic models in our work, namely \( \phi_{z} \) and \( \phi^{\prime}_{z} \).

Definition 5

User Learning Preferences. The learning preference of user u is represented by \( \theta_{u} \), a probability distribution over topics.

Definition 6

Popular Courses. The popular courses in subject area s are represented by \( \theta^{\prime}_{s} \), a probability distribution over topics, from which the popular courses of the subject area can be mined.

3.2 PS-LDA

The hybrid model considers the user’s learning preferences and the influence of popular courses in a unified way. Given a querying user u visiting subject area s, the probability that u chooses course item v when visiting s is sampled from the following model.

$$ P(v\left| {\theta_{u} } \right.,\,\theta^{\prime}_{su} ,\,\phi ,\,\phi^{\prime}) = \lambda_{u} P(v\left| {\theta_{u} } \right.,\,\phi ,\,\phi^{\prime}) + (1 - \lambda_{u} )P(v\left| {\theta^{\prime}_{su} } \right.,\,\phi ,\,\phi^{\prime}) $$
(1)

\( P(v\left| {\theta_{u} } \right.,\phi ,\phi^{\prime}) \) is the probability of generating course item v based on the learning preferences \( \theta_{u} \) of u; the process generating it is denoted P-LDA. \( P(v\left| {\theta^{\prime}_{su} } \right.,\phi ,\phi^{\prime}) \) is the probability of generating course item v according to the popular courses \( \theta^{\prime}_{s} \) of subject area s; the process generating it is denoted S-LDA. \( \lambda_{u} \) is the mixing-weight parameter that controls the selection between the two.
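As a minimal numerical sketch of Eq. 1 (the helper name `mixture_prob` and the probability values are hypothetical, purely for illustration), the mixture can be computed as:

```python
def mixture_prob(p_pref: float, p_popular: float, lam: float) -> float:
    """Eq. 1: mix the preference-based probability P(v | theta_u, ...)
    and the popularity-based probability P(v | theta'_su, ...)
    with the user-specific weight lambda_u."""
    return lam * p_pref + (1.0 - lam) * p_popular

# A large lambda_u trusts personal preference; a small lambda_u
# (e.g., in an unfamiliar subject area) shifts weight to popular courses.
score_familiar = mixture_prob(0.3, 0.1, lam=0.8)    # 0.26
score_unfamiliar = mixture_prob(0.3, 0.1, lam=0.2)  # 0.14
```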

To further alleviate data sparseness, PS-LDA incorporates the label information of users’ historical learning records. We redefine Eq. 1 as follows:

$$ P(v\left| {\theta_{u} } \right.,\theta^{\prime}_{su} ,\phi ,\phi^{\prime}) = \sum\nolimits_{{c \in C_{v} }} {P(v,c\left| {\theta_{u} } \right.,\theta^{\prime}_{su} ,\phi ,\phi^{\prime})} $$
(2)
$$ P(v\left| {\theta_{u} } \right.,\phi ,\phi^{\prime}) = \sum\nolimits_{{c \in C_{v} }} {P(v,c\left| {\theta_{u} } \right.,\phi ,\phi^{\prime})} $$
(3)
$$ P(v\left| {\theta^{\prime}_{su} } \right.,\phi ,\phi^{\prime}) = \sum\nolimits_{{c \in C_{v} }} {P(v,c\left| {\theta^{\prime}_{su} } \right.,\phi ,\phi^{\prime})} $$
(4)

where \( C_{v} \) denotes the set of labels associated with course item v. In PS-LDA, a user’s learning interest \( \theta_{u} \) and the popular courses \( \theta^{\prime}_{s} \) are both modeled as multinomial distributions over latent topics. Each course item v is generated from a sampled topic z. PS-LDA also parameterizes the distribution of labels associated with each topic z, so z is responsible for generating course items and their labels at the same time.

$$ P(v,c\left| {\theta_{u} } \right.,\phi ,\phi^{\prime}) = \sum\limits_{z} {P(v,c\left| z \right.,\phi_{z} ,\phi^{\prime}_{z} )P(z\left| {\theta_{u} } \right.)} = \sum\limits_{z} {P(v\left| z \right.,\phi_{z} )P(c\left| z \right.,\phi^{\prime}_{z} )P(z\left| {\theta_{u} } \right.)} $$
(5)
$$ P(v,c\left| {\theta^{\prime}_{su} } \right.,\phi ,\phi^{\prime}) = \sum\limits_{z} {P(v,c\left| z \right.,\phi_{z} ,\phi^{\prime}_{z} )P(z\left| {\theta^{\prime}_{su} } \right.)} = \sum\limits_{z} {P(v\left| z \right.,\phi_{z} )P(c\left| z \right.,\phi^{\prime}_{z} )P(z\left| {\theta^{\prime}_{su} } \right.)} $$
(6)

We assume that course items and their labels are conditionally independent given the topic. \( P(v,c\left| {\theta_{u} } \right.,\phi ,\phi^{\prime}) \) and \( P(v,c\left| {\theta^{\prime}_{su} } \right.,\phi ,\phi^{\prime}) \) are then calculated according to Eqs. 5 and 6.
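The sum over topics in Eqs. 5 and 6 can be sketched directly (the function name and the toy parameter values below are hypothetical; `theta` stands for \( \theta_{u} \) in P-LDA or \( \theta^{\prime}_{s} \) in S-LDA):

```python
import numpy as np

def joint_prob(phi_v, phi_c, theta):
    """Eqs. 5-6: P(v, c | ., phi, phi') = sum_z P(v|z) P(c|z) P(z|.),
    using the conditional independence of v and c given topic z.
    phi_v[z] = P(v|z), phi_c[z] = P(c|z), theta[z] = topic weights."""
    return float(np.sum(phi_v * phi_c * theta))

# Toy example with K = 2 latent topics (hypothetical values):
phi_v = np.array([0.4, 0.1])    # P(v | z) for each topic
phi_c = np.array([0.5, 0.2])    # P(c | z) for each topic
theta_u = np.array([0.7, 0.3])  # P(z | theta_u)
p_vc = joint_prob(phi_v, phi_c, theta_u)  # 0.7*0.4*0.5 + 0.3*0.1*0.2 = 0.146
```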

Estimating the parameters of PS-LDA yields the topics of course items and labels, reflecting our prior knowledge that course items shared by many users, or with similar content, are clustered into the same topic with high probability. Figure 1 illustrates the generative process with a graphical model, and Algorithm 1 outlines it, where Beta(.) is the Beta distribution and \( \gamma \), \( \gamma^{\prime} \) are its two parameters.

Fig. 1. Graphical representation of PS-LDA

3.3 Model Inference

We use collapsed Gibbs sampling to obtain samples of hidden-variable assignments, from which we estimate the unknown parameters \( \{ \theta ,\theta^{\prime},\phi ,\phi^{\prime},\lambda \} \) of PS-LDA. For simplicity, we fix the hyperparameters \( \alpha \), \( \alpha^{\prime} \), \( \beta \), \( \beta^{\prime} \), \( \gamma \), \( \gamma^{\prime} \), e.g., \( \alpha = \alpha^{\prime} = 50/K \), \( \beta = \beta^{\prime} = 0.01 \), \( \gamma = \gamma^{\prime} = 0.5 \). The sampling process starts from the joint probability of all user profiles in the dataset; then, using the chain rule, we obtain the posterior probability of the sampled topic of each quadruple \( (u,v,s_{v} ,c_{v} ) \). Specifically, we use a two-step Gibbs sampling procedure.

Algorithm 1. The generative process of PS-LDA

Due to space constraints, we only show the derived Gibbs sampling formulas and omit the detailed derivation. We sample t based on the posterior probabilities shown in Eqs. 7 and 8:

$$ P(t_{ui} \text{ = 1}\left| {t_{\neg ui} } \right.,z,u,.)\,{ \propto }\,\frac{{n_{{uz_{ui} }}^{\neg ui} + \alpha_{{z_{ui} }} }}{{\sum\nolimits_{z} {(n_{uz}^{\neg ui} + \alpha_{z} )} }} \times \frac{{n_{{ut_{1} }}^{\neg ui} + \gamma }}{{n_{{ut_{0} }}^{\neg ui} + n_{{ut_{1} }}^{\neg ui} + \gamma + \gamma^{\prime}}} $$
(7)
$$ P(t_{ui} \text{ = 0}\left| {t_{\neg ui} } \right.,z,u,.)\,{ \propto }\,\frac{{n_{{s_{ui} z_{ui} }}^{\neg ui} + \alpha^{\prime}_{{z_{ui} }} }}{{\sum\nolimits_{z} {(n_{{s_{ui} z}}^{\neg ui} + \alpha^{\prime}_{z} )} }} \times \frac{{n_{{ut_{0} }}^{\neg ui} + \gamma^{\prime}}}{{n_{{ut_{0} }}^{\neg ui} + n_{{ut_{1} }}^{\neg ui} + \gamma + \gamma^{\prime}}} $$
(8)

where \( n_{{ut_{1} }} \) is the number of times t = 1 in the user profile \( D_{u} \), and similarly \( n_{{ut_{0} }} \) for t = 0. \( n_{uz} \) is the number of times topic z is sampled from the user-specific multinomial distribution, and \( n_{sz} \) is the number of times topic z is sampled from the multinomial distribution of subject area s. A count with superscript \( \neg ui \) excludes the current instance.

For \( t_{ui} = 1 \) and \( t_{ui} = 0 \), we sample the topic z according to the posterior probabilities shown in Eqs. 9 and 10:

$$ P(z_{ui} \left| {t_{ui} = 1} \right.,z_{\neg ui} ,v,c,u,.)\,{ \propto }\,\frac{{n_{{uz_{ui} }}^{\neg ui} + \alpha_{{z_{ui} }} }}{{\sum\nolimits_{z} {(n_{uz}^{\neg ui} + \alpha_{z} )} }}\frac{{n_{{z_{ui} v_{ui} }}^{\neg ui} + \beta_{{v_{ui} }} }}{{\sum\nolimits_{v} {(n_{{z_{ui} v}}^{\neg ui} + \beta_{v} )} }}\frac{{n_{{z_{ui} c_{ui} }}^{\neg ui} + \beta^{\prime}_{{c_{ui} }} }}{{\sum\nolimits_{c} {(n_{{z_{ui} c}}^{\neg ui} + \beta^{\prime}_{c} )} }} $$
(9)
$$ P(z_{ui} \left| {t_{ui} = 0} \right.,z_{\neg ui} ,v,c,u,.)\,{ \propto }\,\frac{{n_{{s_{ui} z_{ui} }}^{\neg ui} + \alpha^{\prime}_{{z_{ui} }} }}{{\sum\nolimits_{z} {(n_{{s_{ui} z}}^{\neg ui} + \alpha^{\prime}_{z} )} }}\frac{{n_{{z_{ui} v_{ui} }}^{\neg ui} + \beta_{{v_{ui} }} }}{{\sum\nolimits_{v} {(n_{{z_{ui} v}}^{\neg ui} + \beta_{v} )} }}\frac{{n_{{z_{ui} c_{ui} }}^{\neg ui} + \beta^{\prime}_{{c_{ui} }} }}{{\sum\nolimits_{c} {(n_{{z_{ui} c}}^{\neg ui} + \beta^{\prime}_{c} )} }} $$
(10)

where \( n_{zv} \) is the number of times topic z generates course item v, and \( n_{zc} \) is the number of times label c is sampled from topic z.
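To make the switch step concrete, the following sketch (a hypothetical helper, assuming symmetric hyperparameters) samples t from the counts of Eqs. 7 and 8; the second-factor denominator \( n_{ut_{0}} + n_{ut_{1}} + \gamma + \gamma^{\prime} \) is shared by both cases and cancels during normalization:

```python
import random

def sample_switch(n_uz, sum_n_uz, n_sz, sum_n_sz, n_ut1, n_ut0,
                  K, alpha, alpha_p, gamma, gamma_p, rng=random.random):
    """Sample t for one record (Eqs. 7 and 8). t = 1: the topic is drawn
    from the user's preference distribution; t = 0: from the subject
    area's. All counts follow the ¬ui convention (current instance
    excluded); alpha/alpha_p are symmetric Dirichlet priors."""
    w1 = (n_uz + alpha) / (sum_n_uz + K * alpha) * (n_ut1 + gamma)
    w0 = (n_sz + alpha_p) / (sum_n_sz + K * alpha_p) * (n_ut0 + gamma_p)
    return 1 if rng() * (w1 + w0) < w1 else 0
```

Passing a fixed `rng` makes the draw deterministic, which is convenient for testing the unnormalized weights.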

After a sufficient number of sampling iterations, we can estimate the parameters \( \theta ,\theta^{\prime},\phi ,\phi^{\prime} \) and \( \lambda \) as shown in Eqs. 11 to 15:

$$ \hat{\theta }_{uz} = \frac{{n_{uz} + \alpha_{z} }}{{\sum\nolimits_{{z^{\prime}}} {(n_{{uz^{\prime}}} + \alpha_{{z^{\prime}}} )} }} $$
(11)
$$ \hat{\theta }^{\prime}_{sz} = \frac{{n_{sz} + \alpha^{\prime}_{z} }}{{\sum\nolimits_{{z^{\prime}}} {(n_{{sz^{\prime}}} + \alpha^{\prime}_{{z^{\prime}}} )} }} $$
(12)
$$ \hat{\phi }_{zv} = \frac{{n_{zv} + \beta_{v} }}{{\sum\nolimits_{{v^{\prime}}} {(n_{{zv^{\prime}}} + \beta_{{v^{\prime}}} )} }} $$
(13)
$$ \hat{\phi }^{\prime}_{zc} = \frac{{n_{zc} + \beta^{\prime}_{c} }}{{\sum\nolimits_{{c^{\prime}}} {(n_{{zc^{\prime}}} + \beta^{\prime}_{{c^{\prime}}} )} }} $$
(14)
$$ \hat{\lambda }_{u} = \frac{{n_{{ut_{1} }} + \gamma }}{{n_{{ut_{1} }} + n_{{ut_{0} }} + \gamma + \gamma^{\prime}}} $$
(15)
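The estimators of Eqs. 11 to 15 are smoothed relative frequencies of the Gibbs counts. A sketch with symmetric priors and toy counts (the function names and count values are hypothetical; Eqs. 12 to 14 follow the same pattern with their respective counts and priors):

```python
import numpy as np

def estimate_theta(n_uz, alpha):
    """Eq. 11: smoothed topic distribution from the length-K count
    vector n_uz and the symmetric Dirichlet prior alpha."""
    return (n_uz + alpha) / (n_uz.sum() + len(n_uz) * alpha)

def estimate_lambda(n_ut1, n_ut0, gamma, gamma_p):
    """Eq. 15: per-user mixing weight from the switch counts."""
    return (n_ut1 + gamma) / (n_ut1 + n_ut0 + gamma + gamma_p)

K = 4
theta_hat = estimate_theta(np.array([6.0, 2.0, 1.0, 1.0]), alpha=50.0 / K)
lam_hat = estimate_lambda(n_ut1=30, n_ut0=10, gamma=0.5, gamma_p=0.5)
```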

4 Top-K Online Recommendation

In our recommendation, we denote the pair \( (u,s_{u} ) \) of query user \( u \) and subject area \( s_{u} \) as a query task. The query result is a ranked list of course items that matches the user’s learning preferences. After the PS-LDA model parameters \( \theta_{u} \), \( \theta^{\prime}_{s} \), \( \phi_{z} \), \( \phi^{\prime}_{z} \), \( \lambda_{u} \) are inferred during the offline modeling phase, the online recommendation component calculates the ranking score of each course item v in the query subject area \( s_{u} \).

$$ S(u,s_{u} ,v) = \sum\limits_{z} {F(s_{u} ,v,z)W(u,s_{u} ,z)} $$
(16)

\( S(u,s_{u} ,v) \) in Eq. 16 is the ranking framework, which separates the offline process from the online process for score calculation. Specifically, \( F(s_{u} ,v,z) \) is the offline score of course item v with respect to subject area \( s_{u} \) in dimension z; it is independent of the query user. The weight score \( W(u,s_{u} ,z) \) is calculated online to obtain the expected weight of the query task \( (u,s_{u} ) \).

$$ W(u,s_{u} ,z) = \hat{\lambda }_{u} \hat{\theta }_{uz} + (1 - \hat{\lambda }_{u} )\hat{\theta }^{\prime}_{{s{}_{u}z}} $$
(17)
$$ F(s_{u} ,v,z) = \left\{ {\begin{array}{*{20}c} {\hat{\phi }_{zv} \sum\nolimits_{{c_{v} \in C_{v} }} {\hat{\phi }^{\prime}_{{zc_{v} }} } } & {v \in V_{{s_{u} }} } \\ 0 & {v \notin V_{{s_{u} }} } \\ \end{array} } \right. $$
(18)

The main time-consuming components are implemented offline. During a query, the offline scores \( F(s_{u} ,v,z) \) are aggregated over the K topic dimensions by the simple weighted sum of Eqs. 17 and 18. \( W(u,s_{u} ,z) \) is composed of two components that model user learning preferences and popular courses, respectively, each associated with a user motivation. \( F(s_{u} ,v,z) \) captures the similarity between item co-occurrence information and item content to generate recommendations.
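The offline/online split of Eqs. 16 to 18 can be sketched as follows (function names and the K = 2 toy values are hypothetical illustrations, not the paper's implementation):

```python
import numpy as np

def offline_F(phi_zv, phi_zc_labels, in_subject_area):
    """Eq. 18: per-topic offline score of item v. phi_zv is the (K,)
    vector P(v|z); phi_zc_labels is the (K, |C_v|) matrix P(c|z) over
    v's labels; items outside the querying subject area score 0."""
    if not in_subject_area:
        return np.zeros_like(phi_zv)
    return phi_zv * phi_zc_labels.sum(axis=1)

def online_W(theta_u, theta_s, lam):
    """Eq. 17: query-time topic weights mixing preference and popularity."""
    return lam * theta_u + (1.0 - lam) * theta_s

def rank_score(F_v, W_z):
    """Eq. 16: S(u, s_u, v) = sum_z F(s_u, v, z) * W(u, s_u, z)."""
    return float(F_v @ W_z)

# Toy query with K = 2 topics and a single label for v:
W = online_W(np.array([0.7, 0.3]), np.array([0.2, 0.8]), lam=0.5)
F_v = offline_F(np.array([0.4, 0.1]), np.array([[0.5], [0.2]]), True)
score = rank_score(F_v, W)  # [0.2, 0.02] . [0.45, 0.55] = 0.101
```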

5 Experiments

In this section, we conduct several experiments to evaluate the recommendation quality of our model against competing methods.

5.1 Data Setting

Data Sets.

We employ two real-life datasets to evaluate the performance of our model on the course recommendation task.

EdX. EdX is an online MOOC platform launched by Harvard and MIT. Users can take high-quality courses offered by these two famous schools on edX, covering fields such as computer science and mathematics. EdX provides data on 290 Harvard and MIT online courses, 250 thousand certifications, 4.5 million participants, and 28 million participant-hours since 2012.

GCSE. Google Custom Search Engine (GCSE) was used to retrieve LinkedIn profiles containing the keyword “coursera”. Overall, the dataset consists of 15,744 Coursera MOOC entries from 5,668 LinkedIn professionals.

Comparison Methods.

We compare our proposed PS-LDA with the following five recommendation methods.

  • User-Topic Model (UT) [12]: This model is similar to the classic author-topic (AT) model and assumes that topics are generated according to user interests. Its probabilistic formula is as follows, where \( \theta_{B} \) is a background distribution for smoothing: \( P(v\left| u \right.;\Psi ) = \lambda_{B} P(v\left| {\theta_{B} } \right.) + (1 - \lambda_{B} )\sum\limits_{z} {P(z\left| {\theta_{u} } \right.)P(v\left| {\phi_{z} } \right.)} \).

  • Category-based k-Nearest Neighbors Algorithm (CKNN) [3]: CKNN projects a user’s learning history into the category space and models the user’s learning preference with a weighted category hierarchy. When receiving a query, CKNN retrieves all the users and course items belonging to the querying subject area, and then applies user-based CF to predict the querying user’s rating of an unvisited course item. Note that the similarity between two users in CKNN is computed from their weights in the category hierarchy, making CKNN a hybrid recommendation method.

  • Item-based k-Nearest Neighbors Algorithm (IKNN) [13]: This method uses the user’s learning history to build a user-course-item matrix. When receiving a query, IKNN retrieves all users and finds the k nearest neighbors by computing the cosine similarity between two users’ course-item vectors. Finally, the course items in the querying subject area with relatively high ranking scores are recommended.

  • Learning Preference LDA (P-LDA): As a component of the proposed PS-LDA model, P-LDA means our method without exploiting the subject area information of course items. For online recommendation, the ranking score is computed by Eq. 16 with \( F(s_{u} ,v,z) = \hat{\phi }_{zv} \sum\nolimits_{{c_{v} \in C_{v} }} {\hat{\phi }^{\prime}_{{zc_{v} }} } \) and \( W(u,s_{u} ,z) = \hat{\theta }_{uz} \).

  • Subject Area Aware LDA (S-LDA): As another component of the PS-LDA model, S-LDA means our method without considering the content information of course items. For online recommendation, the ranking score is computed by Eq. 16 with \( F(s_{u} ,v,z) = \hat{\phi }_{zv} \) and \( W(u,s_{u} ,z) = \hat{\lambda }_{u} \hat{\theta }_{uz} + (1 - \hat{\lambda }_{u} )\hat{\theta }^{\prime}_{{s_{u} z}} \).

5.2 Evaluation Methods and Indicators

To make an overall evaluation of the recommendation effectiveness of our proposed PS-LDA, we first design two realistic settings: 1) the querying subject areas are new to the querying users; 2) the querying subject areas are familiar to the querying users. We divide each user’s learning history into a test set and a training set, adopting a different dividing strategy for each setting. For the first setting, we select all course items visited by the user in an unfamiliar subject area as the test set; the rest of the user’s learning history forms the training set. For the second setting, we randomly select 20% of the course items visited by the user in familiar subject areas as the test set; the rest forms the training set. Thus the user learning history \( D_{u} \) is split into the training set \( D_{training} \) and the test set \( D_{test} \). To evaluate the recommender models, we adopt the following testing methodology and the measurement Recall@k for each test case (u, v, \( s_{v} \)) in \( D_{test} \).

  1. We randomly select 1000 additional course items in \( s_{v} \) that are unrated by user u, assuming most of them are not of interest to u.

  2. We compute the ranking score for the test item v as well as for the additional 1000 course items.

  3. We form a ranked list by ordering all 1001 course items by ranking score. Let p denote the rank of the test item v within this list; the best result is when v precedes all the random items (i.e., p = 0).

  4. We form a top-k recommendation list by picking the top-k ranked items. If p < k we have a hit (i.e., the test item v is recommended to the user); otherwise we have a miss. The probability of a hit increases with k, and when k = 1001 we always have a hit.

Recall@k is computed as follows: we set hit@k = 1 for a single test case if the test course item v appears in the top-k results, and hit@k = 0 otherwise. The overall Recall@k is defined by averaging over all test cases.

$$ Recall@k = \frac{\# hit@k}{{\left| {D_{test} } \right|}} $$
(19)

where #hit@k denotes the number of hits in the test set, and \( \left| {D_{test} } \right| \) is the number of test cases.
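The evaluation protocol above reduces to two small functions; a sketch with hypothetical ranks and scores (not from our datasets):

```python
def rank_of_test_item(test_score, other_scores):
    """Steps 2-3: the rank p of the held-out item among the 1001
    candidates is the number of random items scored strictly higher."""
    return sum(1 for s in other_scores if s > test_score)

def recall_at_k(test_ranks, k):
    """Eq. 19: a test case is a hit if its rank p < k; Recall@k is the
    fraction of hits over the whole test set."""
    return sum(1 for p in test_ranks if p < k) / len(test_ranks)

# Hypothetical ranks of three held-out items among their 1000 negatives:
ranks = [0, 5, 12]
r10 = recall_at_k(ranks, 10)  # two of three test cases hit in the top-10
```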

5.3 Experimental Results

Overall Performance.

We first present the optimal performance with well-tuned parameters, and then study the impact of the model parameters. Figure 2 reports the performance of the recommendation algorithms on EdX. We show performance for k ranging from 1 to 20, since greater values of k are usually ignored in a typical top-k recommendation task. The algorithms clearly differ in top-k recall. As shown in Fig. 2(a), where the querying subject areas are new, the recall of PS-LDA is about 0.34 at k = 10 and 0.42 at k = 20 (i.e., the model places an appealing course item within the querying subject area in the top-10 with probability 34% and in the top-20 with probability 42%). Clearly, our proposed PS-LDA model outperforms the competing recommendation methods. First, IKNN, CKNN and UT fall behind the three model-based methods, showing the advantage of latent topic models for modeling users’ preferences. Second, PS-LDA outperforms both P-LDA and S-LDA, showing the advantage of combining learning preferences and subject areas in a unified manner.

Fig. 2. Top-k performance on EdX

In Fig. 2(b), we report the performance of all recommendation algorithms in the second setting, where the querying subject areas are familiar to the querying users. The trend is similar to that in Fig. 2(a). The main difference is that CKNN outperforms IKNN in Fig. 2(a), while IKNN exceeds CKNN significantly in Fig. 2(b). This indicates that the CF-based method (i.e., IKNN) suits the setting where the user-item matrix is not very sparse, whereas the hybrid method (i.e., CKNN) is more capable of overcoming data sparseness, e.g., the new-subject-area problem. Another observation is that UT performs almost as well as PS-LDA and outperforms CKNN and IKNN in the familiar setting, verifying the benefit of modeling user preferences with latent topics. However, UT is still less effective than PS-LDA under this setting, and its performance is poor in the new-subject-area setting, as shown in Fig. 2(a): since there is no learning history of the querying user in the new subject area, modeling user preferences alone cannot alleviate the new-subject-area problem.

Figure 3 reports the performance of the recommendation algorithms on the GCSE dataset. We compare PS-LDA with UT, CKNN, IKNN, P-LDA and S-LDA. From the figure, we can see that the trend of comparison result is similar to that presented in Fig. 2, and PS-LDA performs best.

Fig. 3. Top-k performance on GCSE

Impact of Model Parameters.

Tuning model parameters, such as the number of topics, is critical to model performance, so we also study their impact on the EdX dataset. Because of space limitations, we only show results for the new-subject-area setting. For the hyperparameters \( \alpha \), \( \alpha^{\prime} \), \( \beta \), \( \beta^{\prime} \), \( \gamma \) and \( \gamma^{\prime} \), following existing work [2], we empirically set fixed values (i.e., \( \alpha \) = \( \alpha^{\prime} \) = 50/K, \( \beta \) = \( \beta^{\prime} \) = 0.01, \( \gamma \) = \( \gamma^{\prime} \) = 0.5). We tried different setups and found that the estimated topic models are not sensitive to the hyperparameters, but slightly sensitive to the number of topics. We therefore tested P-LDA, S-LDA and PS-LDA while varying the number of topics, as shown in Figs. 4(a) to 4(c). From the results, we observe that: 1) the Recall@k of all latent-topic-based recommender models increases slightly with the number of topics; 2) their performance does not change significantly once the number of topics exceeds 150; and 3) P-LDA, S-LDA and PS-LDA perform well under any number of topics, with PS-LDA consistently best.

Fig. 4. Impact of the number of latent topics

6 Conclusion

This paper proposed a personalized recommendation model, PS-LDA, which facilitates study not only in familiar subject areas but also in new areas where users have no learning history. By exploiting both the content and the subject-area information of course items, our system overcomes the data-sparsity problem of the original user-item matrix. We evaluated the system with extensive experiments on two real-life datasets. The results show that our approach significantly outperforms existing recommendation methods in effectiveness, and they justify each component of our system, such as taking learning preferences and subject-area information into account.