1 Introduction

A folksonomy is a system of classification derived from the practice and method of collaboratively creating and managing tags to annotate and categorize content; this practice is also known as social tagging, social classification, social indexing and collaborative tagging (Trant 2009). Social tagging is widely used on web sites to collect, retrieve and share information. For example, CiteULike (http://www.citeulike.org/) uses tags for sharing bibliographic references, Delicious (https://delicious.com/) uses tags for social bookmarking, Last.fm (http://www.last.fm/) uses tags for sharing music listening habits, and MovieLens (http://movielens.org/) uses tags to help users find the right movies.

These folksonomies allow users to annotate resources with their own tags, and tagging allows users to classify and find information collectively. Especially for multimedia resources such as music, photos or videos, tagging is the only feasible way to organize the data and make it searchable. These tags can be freely chosen by a user and are not restricted to any taxonomy (Krestel and Fankhauser 2012). Many existing studies have investigated a variety of co-occurrence patterns between entities in a folksonomy system. Unsupervised tagging brings benefits such as flexibility, quick adaptation and ease of use, but also presents challenges; for example, the wide variety of tags assigned by users can be redundant, ambiguous or entirely idiosyncratic (Hu et al. 2012).

Tag recommendation can address these challenges by suggesting the tags that users are most likely to use for a resource. Recommending tags serves several purposes, such as increasing the chances of getting a resource annotated, reminding a user what a resource is about, and consolidating the vocabulary across users (Marinho et al. 2011). Furthermore, as Sood et al. (2007) pointed out, tag recommendations “fundamentally change the tagging process from generation to recognition”, which requires less cognitive effort and time. Researchers and Internet enterprises therefore pay increasing attention to tag recommendation. Recently, scholars have put forward various tag recommendation approaches, which mainly include collaborative filtering approaches, graph-based approaches, content-based approaches and hybrid approaches. The related work is introduced in the next section.

Existing work seldom considers the fact that users’ tagging behavior changes with time. However, our statistical analysis shows that the total number of tags used by a user changes over time in a social tagging system. In this paper, we study tag recommendation by taking this phenomenon into account. We first propose three types of user tagging status, namely the growing status, the mature status and the dormant status, and devise an algorithm for determining the user tagging status. After analysing the characteristics of each status, we present three corresponding tag recommendation strategies that compute tag probability distributions in the users’ and resources’ tag spaces, based on the statistical language model. Finally, comparison experiments on the CiteULike dataset and the Last.fm dataset show that the proposed tag recommendation method is more accurate than the baseline approaches FolkRank (Kim and El Saddik 2011), LocalRank (Kubatz et al. 2011) and most popular tags ρ-mix (Jäschke et al. 2008).

The remainder of the paper is structured as follows. Section 2 briefly reviews the related work. Section 3 introduces some basic concepts. In Section 4, we formalize the concept of user tagging status and present the algorithm for determining it. Section 5 presents the tag recommendation method based on the different user tagging statuses. The comparative experiments on two social tagging systems are described in Section 6. Finally, conclusions and discussion are given in Section 7.

2 Related work

In recent years, scholars have put forward various tag recommendation approaches. Generally speaking, these approaches could be divided into four categories, namely the collaborative filtering approaches, the graph-based approaches, the content-based approaches and the hybrid approaches.

Collaborative filtering is a common technique used by recommender systems. Because a social tagging system involves a ternary relationship among users, resources and tags, traditional collaborative filtering methods cannot be applied directly unless the ternary relation is reduced to a lower-dimensional space. Lu et al. (2011) developed a post-based collaborative filtering framework to recommend tags based on the query user’s tagging history and the tags that have been associated with the query document, leveraging the ternary relationships. Liu et al. (2011) injected the social relations between users and the content similarities between resources into a graph representation of folksonomies, exploited random-walk computation of similarities, and combined both the collaborative information and the tag preferences to recommend tags. Wang et al. (2013) put forward a novel hierarchical Bayesian model that extends collaborative filtering and seamlessly integrates the item-tag matrix, item content information and social networks between items into one principled model. Ma et al. (2015) proposed a recommendation approach that fuses user-generated tags and social relations in a novel way, in order to alleviate the data sparsity problem and improve recommendation accuracy.

The basic idea of graph-based approaches is to construct a graph with users, resources and tags as vertices and to build edges according to users’ tagging behavior (Liu et al. 2010). This kind of method does not need to consider the content of resources or the semantic information of tags. Kim and El Saddik (2011) introduced a new way to compute the probabilistic interpretation in FolkRank by representing it as a linear combination of personalized PageRank vectors. However, one of the major disadvantages of FolkRank is its steep computational cost. In contrast to the previous graph-based algorithms, Kubatz et al. (2011) computed the rank weights of tags based only on the tag space of a given user and resource. Ramezani (2011) suggested improving the existing graph-based tag recommendation techniques by introducing a new model of the folksonomy as a directed graph. Rawashdeh et al. (2013) proposed adapting the Katz measure to social tagging systems from a graph-based perspective. Cai et al. (2016) proposed GRETA, a novel graph-based approach that constructs an Entity-Tag Graph (ETG) for GitHub using domain knowledge from StackOverflow and assigns tags to repositories with a random walk algorithm. Hmimida and Kanawati (2016) proposed a graph-coarsening approach in which a community detection algorithm is applied to the diversiform networks to reduce the execution time of graph-based tag recommenders in large-scale folksonomies.

The content-based approaches usually employ the content of resources and adopt machine learning techniques to recommend tags. Krestel and Fankhauser (2012) thoroughly investigated the use of language models for tag recommendation, showing that simple language models built from users and resources yield competitive performance while consuming only a fraction of the computational cost of more sophisticated algorithms. By modeling the generating process of social tagging systems with a Latent Dirichlet Allocation approach, Zhang et al. (2012) built a fully generative model for social tagging and leveraged it to estimate the relations among users, tags and resources for tag recommendation. To learn the weights of different types of nodes and edges represented by features, Feng and Wang (2012) proposed an optimization framework that learns the best feature weights by maximizing the average area under the curve (AUC) of the tag recommender. Wu et al. (2016) proposed a generative model in which words are generated from the tag-word distribution as well as from the tag itself. Xie et al. (2016) proposed SenticRank, a novel generic model that incorporates various kinds of sentiment information into personalized recommendation based on user profiles and other information.

Generally speaking, the hybrid approaches combine two or more kinds of tag recommendation algorithms. Gemmell et al. (2010) proposed a weighted linear hybrid incorporating simple popularity and collaborative filtering components; the success of the hybrid over its lower-dimensional components clearly demonstrates the importance of an integrative approach that exploits multiple dimensions of the data. Belém et al. (2014) proposed personalized and object-centered tag recommendation methods for Web 2.0 applications. Kim and Kim (2014) investigated association rules, bigrams, tag expansion and implicit trust relationships for providing tag and item recommendations in a social tagging recommendation system. Wei et al. (2016) proposed a hybrid movie recommendation approach based on the user’s annotating information to improve the ability of fusion and provide personalized recommendation services.

Furthermore, in the past few years, we have witnessed great advances in many perception tasks by using deep learning models. Wang and Yeung (2016) proposed a general framework for Bayesian deep learning and discussed the applications of deep learning on recommender systems, topic models and control. Wang et al. (2015) proposed a hierarchical Bayesian model called collaborative deep learning, which jointly performs deep representation learning for the content information and collaborative filtering for the ratings matrix.

However, existing work seldom considers that users’ tagging behavior changes with time. In fact, tagging behavior varies across periods: a user might tag resources frequently during one period, occasionally during another, and seldom during a third. Thus, this paper studies tag recommendation by considering the fact that users’ tagging behavior changes with time.

3 Basic concepts

3.1 Social tagging system

Folksonomy (Vander Wal 2007), a term coined by Thomas Vander Wal in 2004, is the basic data structure of the social tagging system.

Formally, a folksonomy is a quadruple \( \mathbb {F}=(U,R,Tag,Y)\), where \(U=\{u_{1},\cdots ,u_{k},\cdots ,u_{K}\}\), \(R=\{r_{1},\cdots ,r_{l},\cdots ,r_{L}\}\) and \(Tag=\{tag_{1},\cdots ,tag_{m},\cdots ,tag_{M}\}\) are finite sets, whose elements are called users, resources and tags, respectively; K, L, and M are the numbers of users, resources, and tags, respectively. Y is a ternary relation among them, that is, \(Y \subseteq U \times R \times Tag\).

The ternary relation Y can be projected onto three binary relations, and each binary relation can be described by a matrix. That is, the matrices \(UTag_{K\times M}\), \(RTag_{L\times M}\) and \(UR_{K\times L}\) represent the user-tag, the resource-tag and the user-resource relations, respectively. Let the element of \(UTag_{K\times M}\) be \(w_{u_{k}tag_{m}}\), where \(w_{u_{k}tag_{m}}\) represents the number of resources that the user \(u_{k}\) has labeled with the tag \(tag_{m}\). Let the element of \(RTag_{L\times M}\) be \(w_{r_{l}tag_{m}}\), where \(w_{r_{l}tag_{m}}\) represents the number of users who use the tag \(tag_{m}\) to label the resource \(r_{l}\). Let the element of \(UR_{K\times L}\) be \(w_{u_{k}r_{l}}\), where \(w_{u_{k}r_{l}}\) represents the number of tags that the user \(u_{k}\) has assigned to the resource \(r_{l}\).
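As an illustration of these projections, the following minimal sketch derives the three weighted relations from a list of (user, resource, tag) triples. The function name `build_relations`, the dictionary representation (instead of dense matrices) and the toy data are assumptions made for this example only.

```python
from collections import defaultdict


def build_relations(Y):
    """Project the ternary relation Y onto three weighted binary relations."""
    utag = defaultdict(int)  # (user, tag) -> number of resources the user labeled with the tag
    rtag = defaultdict(int)  # (resource, tag) -> number of users who labeled the resource with the tag
    ur = defaultdict(int)    # (user, resource) -> number of tags the user assigned to the resource
    for user, resource, tag in set(Y):  # Y is a relation, so duplicate triples count once
        utag[(user, tag)] += 1
        rtag[(resource, tag)] += 1
        ur[(user, resource)] += 1
    return dict(utag), dict(rtag), dict(ur)


# Hypothetical toy folksonomy.
Y = [
    ("u1", "r1", "jazz"),
    ("u1", "r2", "jazz"),
    ("u2", "r1", "jazz"),
    ("u2", "r1", "piano"),
]
utag, rtag, ur = build_relations(Y)
```

Sparse dictionaries are used here because folksonomy matrices are typically very sparse; a dense K×M array would carry the same information.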

Let \(Tag_{u_{k}}\) be the set of tags used by the user \(u_{k}\), and \(Tag_{r_{l}}\) be the set of tags assigned to the resource \(r_{l}\). Each post a of the folksonomy consists of three parts: a user \(u_{k}\), a resource \(r_{l}\) and all tags in \(Tag(u_{k},r_{l})\). That is, \(a=(u_{k},r_{l},Tag(u_{k},r_{l}))\), where \(Tag(u_{k},r_{l})\) is the set of tags that the user \(u_{k}\) has assigned to the resource \(r_{l}\). All posts of the social tagging system constitute the post set A.

For a given user \(u_{q} \in U\) and a given resource \(r_{q} \in R\) with \(Tag(u_{q},r_{q})\neq \varnothing \), the task of tag recommendation is to recommend a set of tags, \(\widehat {Tag}(u_{q},r_{q})\), with a tag recommendation algorithm, where \(\widehat {Tag}(u_{q},r_{q}) \subseteq Tag\). In many cases, \(\widehat {Tag}(u_{q},r_{q})\) is computed by generating a ranking on the set of tags according to some quality or relevance criterion, from which the top n elements are then selected and recorded in \(\widehat {Tag}^{n}(u_{q},r_{q})\).

3.2 Statistical language model

The statistical language model (Ponte and Croft 1998), abbreviated as SLM, is widely used in natural language processing fields such as speech recognition, information retrieval and machine translation. Essentially, it is a probability distribution model that describes the inherent statistical and structural regularities of natural language. A language is the set of all its strings, and a language model is a probability distribution over the strings of the language.

In the field of information retrieval, the basic idea of the statistical language model is to measure the relevance between a query q and a document d by the probability of generating the query from the document, i.e. \(p_{LM}(q|d)={\prod }_{w\in q}p(w|d)\), where w is a word of the query, and p(w|d) is the probability of generating the word w from the document d, which is calculated as follows:

$$\begin{array}{@{}rcl@{}} p(w|d)&=&\frac {N_{d}}{N_{d} + \lambda} \times \frac{tf(w,d)} {N_{d}} \\ &&+ \left( 1 - \frac {N_{d}}{N_{d}+\lambda} \right) \times \frac{tf(w,D)}{N_{D}} , \end{array} $$
(1)

where \(N_{d}\) is the length of the document d measured in words, tf(w,d) is the frequency of the word w in the document d, \(N_{D}\) is the total number of words in all the documents, tf(w,D) is the frequency of w in all the documents, and λ is a Dirichlet smoothing factor whose value is set to be the average document length in the document set, i.e. \(\lambda =N_{d}/N_{D}\).
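A minimal sketch of Eq. (1) and of the query likelihood built from it may make the smoothing concrete. Documents are assumed to be lists of tokens, the smoothing factor is simply passed in by the caller as `lam`, and all function names and the toy collection are illustrative assumptions.

```python
from collections import Counter


def p_word_given_doc(word, doc, collection, lam):
    """Eq. (1): document model interpolated with the collection model."""
    n_d = len(doc)                                    # N_d
    n_D = sum(len(d) for d in collection)             # N_D
    tf_d = Counter(doc)[word]                         # tf(w, d)
    tf_D = sum(Counter(d)[word] for d in collection)  # tf(w, D)
    weight = n_d / (n_d + lam)
    p_doc = tf_d / n_d if n_d else 0.0
    return weight * p_doc + (1 - weight) * tf_D / n_D


def query_likelihood(query, doc, collection, lam):
    """p_LM(q|d): product of the smoothed word probabilities over the query."""
    p = 1.0
    for word in query:
        p *= p_word_given_doc(word, doc, collection, lam)
    return p


# Hypothetical two-document collection.
docs = [["a", "b", "a"], ["b", "c"]]
p_a = p_word_given_doc("a", docs[0], docs, lam=2.0)
p_c = p_word_given_doc("c", docs[0], docs, lam=2.0)
```

Because of the collection term, the word "c" receives a non-zero probability under the first document even though it never occurs there, which is what keeps the product in `query_likelihood` from collapsing to zero.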

4 User tagging status

4.1 Related definitions

Let us observe the change in the total number of tags that a user owns during a period of time T. Let the start moment be \(T_{0}\); observation points are taken at equal intervals (in the following experiments, one month is chosen as the unit of time), so the next moment is \(T_{1}\). Suppose the current moment is \(T_{t}\).

The set \(Tag^{u_{k}T_{t}}\) consists of the different tags used by the user \(u_{k}\) in a unit time interval, i.e. \([T_{t-1},T_{t})\). \(f_{u_{k}}(T_{t})\) indicates the number of tags used by the user \(u_{k}\) in \([T_{t-1},T_{t})\), that is:

$$ f_{u_{k}}(T_{t})=|Tag^{u_{k}T_{t}}|. $$
(2)

\(g_{u_{k}}(T_{t})\) is the number of tags used by the user \(u_{k}\) in the time interval \([T_{0},T_{t})\), that is:

$$ g_{u_{k}}(T_{t})=|\bigcup \limits_{\tau =0}^{t}Tag^{u_{k}T_{\tau}}|. $$
(3)

For the user \(u_{k}\), the tags used before the moment \(T_{t-1}\) are called the historical tags of the user at the moment \(T_{t}\). Obviously, the number of historical tags is \(g_{u_{k}}(T_{t-1})=|\bigcup \limits _{\tau =0}^{t-1}Tag^{u_{k}T_{\tau }}|\). The tags that were not used before the moment \(T_{t-1}\) but are used in \([T_{t-1},T_{t})\) are called the new tags of the user. The number of new tags is \(g_{u_{k}}(T_{t})-g_{u_{k}}(T_{t-1})=|\bigcup \limits _{\tau =0}^{t}Tag^{u_{k}T_{\tau }} \backslash \bigcup \limits _{\tau =0}^{t-1}Tag^{u_{k}T_{\tau }}|\).

During the time period \([T_{t-1},T_{t})\), the user \(u_{k}\) may or may not tag. Thus, we distinguish two cases:

Case 1: the user \(u_{k}\) has no tagging behavior, i.e. \(f_{u_{k}}(T_{t})=0\).

Case 2: the user \(u_{k}\) has tagging behavior, i.e. \(f_{u_{k}}(T_{t})\neq 0\).

In Case 2, we need to consider three situations:

The user \(u_{k}\) uses both new tags and historical tags, i.e. \(0< \frac {g_{u_{k}}(T_{t})-g_{u_{k}}(T_{t-1})}{f_{u_{k}}(T_{t})}<1\).

The user \(u_{k}\) uses only new tags, i.e. \(\frac {g_{u_{k}}(T_{t})-g_{u_{k}}(T_{t-1})}{f_{u_{k}}(T_{t})}=1\).

The user \(u_{k}\) uses only historical tags, i.e. \(g_{u_{k}}(T_{t})-g_{u_{k}}(T_{t-1}) =0\).
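The quantities f and g and the case distinction above can be sketched as follows. The representation of a user's history as a dictionary from interval index t to the set of tags used in the unit interval ending at that index, together with all function names and labels, is an assumption of this example.

```python
def f(tags_by_interval, t):
    """Eq. (2): number of distinct tags used in the unit interval ending at index t."""
    return len(tags_by_interval.get(t, set()))


def g(tags_by_interval, t):
    """Eq. (3): number of distinct tags used in all intervals up to index t."""
    seen = set()
    for tau, tags in tags_by_interval.items():
        if tau <= t:
            seen |= tags
    return len(seen)


def tagging_case(tags_by_interval, t):
    """Classify the interval ending at index t into Case 1 or a sub-case of Case 2."""
    used = f(tags_by_interval, t)
    if used == 0:
        return "no tagging"                       # Case 1
    new = g(tags_by_interval, t) - g(tags_by_interval, t - 1)
    if new == 0:
        return "historical only"
    if new == used:
        return "new only"
    return "new and historical"


# Hypothetical history of one user over three unit intervals.
history = {1: {"jazz", "piano"}, 2: {"jazz", "blues"}, 3: {"jazz"}}
cases = [tagging_case(history, t) for t in (1, 2, 3, 4)]
```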

During a period of time, when the total number of tags that the user owns (i.e. the total number of different tags used by the user to tag resources) grows, whether slowly or rapidly, the user certainly uses new tags and possibly also uses historical tags. When this total remains unchanged, the user either uses only historical tags or has no tagging behavior. In other words, a user’s tagging status falls into one of three cases during a period of time: in the first case the user’s total number of tags increases rapidly; in the second case it increases slowly; in the third case the user has no tagging behavior. We define these three cases as the growing status, the mature status and the dormant status, respectively.

In the period of time \(T=[T_{t-{\Delta } t},T_{t})\), if the total number of tags that a user owns increases and the average growth rate of this total is no less than the threshold α, we say that the user is in the growing status. That is, during this period the user is quite active and adds many new tags to the social tagging system.

Definition 1

(Growing Status) Considering the period of time \([T_{t-{\Delta } t},T_{t})\), for a user \(u_{k}\), if \(\frac {g_{u_{k}}(T_{t-1})-g_{u_{k}}(T_{t-{\Delta } t})}{\Delta t}\geq \alpha \), then the user tagging status of the user \(u_{k}\) at the time \(T_{t}\) is the growing status.

In the period of time \(T=[T_{t-{\Delta } t},T_{t})\), if the total number of tags that a user owns increases and the average growth rate of this total is less than the threshold α, we say that the user is in the mature status. That is, during this period the user adds a few new tags to the social tagging system and also uses many historical tags.

Definition 2

(Mature Status) Considering the period of time \([T_{t-{\Delta } t},T_{t})\), for a user \(u_{k}\), if \(\exists T_{t^{\prime }} \in [T_{t-{\Delta } t},T_{t})\) such that \(f_{u_{k}}(T_{t^{\prime }})\neq 0\) and \(0\leq \frac {g_{u_{k}}(T_{t-1})-g_{u_{k}}(T_{t-{\Delta } t})}{\Delta t}< \alpha \), then the user tagging status of the user \(u_{k}\) at the time \(T_{t}\) is the mature status.

In the period of time \(T=[T_{t-{\Delta } t},T_{t})\), if a user has no tagging behavior and the total number of tags that the user owns remains constant, we say that the user is in the dormant status.

Definition 3

(Dormant Status) Considering the period of time \([T_{t-{\Delta } t},T_{t})\), for a user \(u_{k}\), if \(\forall T_{t^{\prime }} \in [T_{t-{\Delta } t},T_{t})\) we have \(f_{u_{k}}(T_{t^{\prime }})=0\), then the user tagging status of the user \(u_{k}\) at the time \(T_{t}\) is the dormant status.

4.2 Determining user tagging status algorithm

Suppose the current time is \(T_{t}\). We can determine the user tagging status at the moment \(T_{t}\) according to Definition 1, Definition 2 and Definition 3, by analysing the tagging history of the user \(u_{k}\) during the period of time Δt.

The determining user tagging status algorithm, abbreviated as DUTS, is described in Algorithm 1. Here, \(T_{0}\) is the moment when the user begins to use the social tagging system. If the user \(u_{k}\) has used the social tagging system for less than Δt and has tagged recently, we consider the user to be in the growing status. Because users differ, the time a user spends in each tagging status also differs. To simplify the calculation, the DUTS algorithm only looks back over a fixed period of the user’s tagging history.

Algorithm 1 The determining user tagging status algorithm (DUTS)
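Since the listing of Algorithm 1 is not available in the text, the following is only a sketch reconstructed from Definitions 1 to 3 and the short-history rule stated above; it should not be read as the authors' exact procedure. It assumes that a user's history is a dictionary from interval index (counted from the user's own \(T_{0}\)) to the set of tags used in that interval, that α is positive, and that the function names and status labels are free choices of this example.

```python
def cumulative_count(tags_by_interval, t):
    """g(T_t): number of distinct tags used in all intervals up to index t."""
    seen = set()
    for tau, tags in tags_by_interval.items():
        if tau <= t:
            seen |= tags
    return len(seen)


def determine_status(tags_by_interval, t, delta_t, alpha):
    """Return 'growing', 'mature' or 'dormant' for the moment T_t."""
    start = max(0, t - delta_t)
    # Definition 3: no tagging at any observation point in [T_{t-delta_t}, T_t).
    active = any(tags_by_interval.get(tp) for tp in range(start, t))
    if not active:
        return "dormant"
    # Short-history rule: fewer than delta_t intervals of use, but recent tagging.
    if t < delta_t:
        return "growing"
    # Definitions 1 and 2: average growth of the tag total over the window.
    growth = (cumulative_count(tags_by_interval, t - 1)
              - cumulative_count(tags_by_interval, t - delta_t))
    return "growing" if growth / delta_t >= alpha else "mature"


# Hypothetical user: six new tags in three months, then only reuse, then silence.
history = {1: {"a", "b"}, 2: {"c", "d"}, 3: {"e", "f"}, 4: {"a"}, 5: {"b"}, 6: {"a"}}
statuses = [determine_status(history, t, delta_t=3, alpha=1.0) for t in (2, 4, 7, 10)]
```

Checking dormancy first keeps the three outcomes mutually exclusive even when the growth rate is zero.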

5 Tag recommendation method based on user tagging status

Fig. 1 shows the framework of the tag recommendation model proposed in this paper. Once the user’s tagging status is determined, we can employ a different strategy to recommend tags for the user. Algorithm 2 describes the tag recommendation algorithm based on user tagging status, abbreviated as TR-UTS. First, the algorithm computes the tagging status of the user at the moment \(T_{t}\) by using Algorithm 1. Then, the algorithm selects the tag recommendation strategy that matches the user’s tagging status and calculates the corresponding tag probability distributions. Finally, the top n tags that are most likely to be used by the user are recommended.

Fig. 1 The framework of the tag recommendation model based on user tagging status

Algorithm 2 The tag recommendation algorithm based on user tagging status (TR-UTS)

Two points of the proposed method need further explanation. Question 1: how to obtain the group members of the user \(u_{q}\). Utilizing group information has been shown to be very helpful for improving recommendation accuracy. Since this is not the focus of this paper, we simply take the friendships existing in the folksonomy as the user’s group information. For example, each user has an average of 13.443 friends in Last.fm; for CiteULike, there also exists “group id | username” information. Proposing an appropriate clustering method for users is left as future work, in order to further enhance the flexibility of the proposed method. Question 2: how to obtain the resources similar to the target resource \(r_{q}\). This is described in detail in Section 5.1.

5.1 The strategy for user tagging status in growing status

Consider a given user \(u_{q}\) who tags a given resource \(r_{q}\) at the current time. If the user’s tagging status is the growing status, the number of resources tagged by the user has been increasing continually during the period before the current time, and so has the total number of tags used by the user. Therefore, recommendation performance can be enhanced by considering the following two kinds of tags: (1) the tags used by the target user and his/her group members; and (2) the tags used to label the target resource and its similar resources. We then compute the probability distribution of those tags with SLM to recommend tags. This approach not only ensures that the recommended tags are personalized, but also increases their diversity.

The strategy for the user tagging status in growing status, abbreviated as TR-GS, is described as follows.

Step1: according to the resource-tag matrix \(RTag_{L\times M}\), calculate the cosine similarity between the resource \(r_{q}\) and the other resources, and select the top S resources with the highest similarity to \(r_{q}\) as the neighbor set of the resource \(r_{q}\).

Let a row of the resource-tag matrix \(RTag_{L\times M}\) be the vector \(\textbf{r}\). Then, the similarity \(sim(\textbf{r}_{l},\textbf{r}_{q})\) between \(r_{l}\) and \(r_{q}\) is computed as follows:

$$ sim(\textbf{r}_{l},\textbf{r}_{q})= \frac{\textbf{r}_{l} \cdot \textbf{r}_{q}}{\parallel \textbf{r}_{l} \parallel \parallel \textbf{r}_{q} \parallel}. $$
(4)
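Step 1 can be sketched as below, with each row of the resource-tag matrix stored as a sparse dictionary from tag to weight. The names `cosine` and `top_s_neighbors`, the alphabetical tie-breaking and the toy rows are assumptions of this example.

```python
import math


def cosine(a, b):
    """Eq. (4) for two sparse tag-weight vectors given as dicts."""
    dot = sum(w * b.get(tag, 0) for tag, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_s_neighbors(rtag_rows, r_q, s):
    """Return the S resources most similar to r_q as (resource, similarity) pairs."""
    scored = [(cosine(row, rtag_rows[r_q]), r)
              for r, row in rtag_rows.items() if r != r_q]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by resource id
    return [(r, sim) for sim, r in scored[:s]]


# Hypothetical rows of the resource-tag matrix.
rows = {
    "r1": {"jazz": 2, "piano": 1},
    "r2": {"jazz": 1},
    "r3": {"rock": 3},
}
neighbors = top_s_neighbors(rows, "r1", s=1)
```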

Step2: considering all tags labeled to the resource \(r_{q}\) and its neighbors, compute the tag probability distribution \(p(tag_{m}\mid r_{q})\) according to the following equation:

$$\begin{array}{@{}rcl@{}} p(tag_{m}\mid r_{q})&=&\frac{N_{Tag_{r_{q}}}}{N_{Tag_{r_{q}}} + \lambda_{r_{q}}} \times \frac {TF(tag_{m},Tag_{r_{q}})}{N_{Tag_{r_{q}}}} \\ &&+ \left( 1- \frac{N_{Tag_{r_{q}}}}{N_{Tag_{r_{q}}} + \lambda_{r_{q}}}\right) \times \frac {TF(tag_{m},TagS)}{N_{TagS}}, \end{array} $$
(5)

where \(TF(tag_{m},Tag_{r_{q}})\) is the number of users who use \(tag_{m}\) to label \(r_{q}\), namely, \(TF(tag_{m},Tag_{r_{q}})=w_{r_{q}tag_{m}}\). \(N_{Tag_{r_{q}}}\) is the sum of the weights of the tags of the resource \(r_{q}\). TagS is the set of tags labeled to the resource \(r_{q}\) and its neighbors, and for \(\forall tag_{m} \in TagS\), its tag weight is \(w^{\prime }_{r_{k}tag_{m}}=w_{r_{k}tag_{m}} \times sim(\textbf {r}_{k},\textbf {r}_{q})\). \(N_{TagS}\) is the sum of the weights of the tags in the set TagS. \(TF(tag_{m},TagS)\) is the sum of the weights of the tag \(tag_{m}\) labeled to the resource \(r_{q}\) and its neighbors. \(\lambda _{r_{q}}\) is interpreted as a Dirichlet smoothing factor, i.e. \(\lambda _{r_{q}}=N_{Tag_{r_{q}}} / N_{TagS}\).

Step3: considering all the tags used by the user \(u_{q}\) and his/her group members, based on the user-tag matrix \(UTag_{K\times M}\), compute the tag probability distribution \(p(tag_{m}\mid u_{q})\) according to the following equation:

$$\begin{array}{@{}rcl@{}} p(tag_{m}\mid u_{q})&=&\frac{N_{Tag_{u_{q}}}}{N_{Tag_{u_{q}}} + \lambda_{u_{q}}} \times \frac {TF(tag_{m},Tag_{u_{q}})}{N_{Tag_{u_{q}}}} \\ &&+ \left( 1- \frac{N_{Tag_{u_{q}}}}{N_{Tag_{u_{q}}} + \lambda_{u_{q}}}\right) \times \frac {TF(tag_{m},Tag_{U_{q}})}{N_{Tag_{U_{q}}}}, \end{array} $$
(6)

where \(N_{Tag_{u_{q}}}\) is the sum of the weights of the tags that the user \(u_{q}\) has used. \(TF(tag_{m},Tag_{u_{q}})\) is the weight of the tag \(tag_{m}\) for the user, namely, \(TF(tag_{m},Tag_{u_{q}})=w_{u_{q}tag_{m}}\). The set \(U_{q}\) consists of the user \(u_{q}\) and the users in the same groups as \(u_{q}\). \(Tag_{U_{q}}\) is the set of tags used by the users in \(U_{q}\). \(N_{Tag_{U_{q}}}\) is the sum of the weights of the tags in the set \(Tag_{U_{q}}\). \(TF(tag_{m},Tag_{U_{q}})\) is the sum of the weights of the tag \(tag_{m}\) used by the users in the set \(U_{q}\). \(\lambda _{u_{q}}\) is a Dirichlet smoothing factor, i.e. \(\lambda _{u_{q}}=N_{Tag_{u_{q}}} / N_{Tag_{U_{q}}}\).

Step4: compute the probability \(p(tag_{m}\mid u_{q},r_{q})\) that the user \(u_{q}\) uses the tag \(tag_{m}\) to label the resource \(r_{q}\) by combining \(p(tag_{m}\mid r_{q})\) and \(p(tag_{m}\mid u_{q})\) according to the following equation:

$$\begin{array}{@{}rcl@{}} p(tag_{m}\mid u_{q},r_{q})& =&(1-\beta) \times p(tag_{m}\mid u_{q}) \\ &&+\beta \times p(tag_{m}\mid r_{q}). \end{array} $$
(7)

where β ∈ [0,1].

Step5: sort the tags according to the probability \(p(tag_{m}\mid u_{q},r_{q})\), and then select the top n elements with the highest rank values to recommend to the user \(u_{q}\), that is,

$$\begin{array}{@{}rcl@{}} \widehat{Tag}^{n}(u_{q},r_{q})=\max \limits_{tag_{m} \in Tag}^{n} (p(tag_{m}\mid u_{q},r_{q})). \end{array} $$
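Steps 2 to 5 share one pattern: a smoothed distribution is estimated for the resource side and for the user side, the two are mixed with β, and the n best tags are kept. The sketch below follows that pattern under several assumptions of this example: tag weights are given as dictionaries, the pooled weights (resource plus neighbors, or user plus group) are already aggregated and cover every tag of the target, and the fallback used when the target has no tags of its own is a choice made here rather than something taken from the text.

```python
def smoothed_distribution(own, pooled):
    """Common form of Eqs. (5) and (6): own tag weights smoothed by pooled weights."""
    n_own = sum(own.values())
    n_pool = sum(pooled.values())
    if n_pool == 0:
        return {}
    if n_own == 0:
        # Assumption: with no tags on the target itself, use the pooled estimate.
        return {tag: w / n_pool for tag, w in pooled.items()}
    lam = n_own / n_pool              # smoothing factor as defined in the text
    mix = n_own / (n_own + lam)
    return {
        tag: mix * own.get(tag, 0) / n_own + (1 - mix) * w / n_pool
        for tag, w in pooled.items()
    }


def recommend_growing(p_user, p_resource, beta, n):
    """Eq. (7) followed by the top-n selection of Step 5."""
    tags = set(p_user) | set(p_resource)
    score = {
        t: (1 - beta) * p_user.get(t, 0.0) + beta * p_resource.get(t, 0.0)
        for t in tags
    }
    return sorted(score, key=lambda t: (-score[t], t))[:n]


# Hypothetical weights for one resource, one user, and their pooled neighborhoods.
p_r = smoothed_distribution({"jazz": 2, "piano": 1},
                            {"jazz": 3.0, "piano": 1.0, "blues": 1.0})
p_u = smoothed_distribution({"piano": 3, "live": 1},
                            {"piano": 4, "live": 1, "jazz": 3})
top2 = recommend_growing(p_u, p_r, beta=0.5, n=2)
```

In this toy case "blues" and "jazz" enter the candidate list only through the neighbors and the group, which illustrates how the pooled terms add diversity to the personalized tags.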

5.2 User tagging status in mature status

When the given user \(u_{q}\) tags the given resource \(r_{q}\), if the user’s tagging status at that moment is the mature status, then during the period before that moment the user’s tagging behavior has tended to be stable and the number of tagged resources has reached a certain level; thus, the total number of the user’s tags has increased slowly. We therefore compute the tag probability distribution with SLM based on the user’s tags and the resource’s tags. This approach not only ensures the accuracy of tag recommendation, but also reduces the computational complexity.

The strategy for the user tagging status in mature status, abbreviated as TR-MS, is described as follows.

Step1: for \(\forall tag_{m} \in Tag_{u_{q}}\), the probability \(p_{u}(tag_{m}\mid u_{q})\) that the user \(u_{q}\) will use \(tag_{m}\) is calculated as follows:

$$ p_{u}(tag_{m}\mid u_{q})=\frac{w_{u_{q}tag_{m}}}{N_{Tag_{u_{q}}}}, $$
(8)

where, \(N_{Tag_{u_{q}}}\) is the sum of tag weights of tags used by u q , namely, \(N_{Tag_{u_{q}}}=\sum \limits _{tag \in Tag_{u_{q}}} w_{u_{q}tag}\).

Step2: for \(\forall tag_{m} \in Tag_{r_{q}}\), the probability \(p_{r}(tag_{m}\mid r_{q})\) that the resource \(r_{q}\) will be labeled with \(tag_{m}\) is calculated as follows:

$$ p_{r}(tag_{m}\mid r_{q})=\frac{w_{r_{q}tag_{m}}}{N_{Tag_{r_{q}}}}, $$
(9)

where, \(N_{Tag_{r_{q}}}\) is the sum of tag weights of tags labeled to r q , namely, \(N_{Tag_{r_{q}}}=\sum \limits _{tag \in Tag_{r_{q}}} w_{r_{q}tag}\).

Step3: calculate \(p(tag_{m}\mid u_{q},r_{q})\), the probability that a given tag \(tag_{m}\) will be used by the given user \(u_{q}\) to label the given resource \(r_{q}\), using a weighted linear combination of \(p_{u}(tag_{m}\mid u_{q})\) and \(p_{r}(tag_{m}\mid r_{q})\), as follows:

$$ \begin{array}{ll} p(tag_{m}\mid u_{q},r_{q}) =&(1-\gamma) \times p_{u}(tag_{m}\mid u_{q}) \\ &+\gamma \times p_{r}(tag_{m}\mid r_{q}), \end{array} $$
(10)

where \(tag_{m} \in (Tag_{u_{q}} \cup Tag_{r_{q}})\), and γ ∈ [0,1].

Step4: sort the tags according to the probability \(p(tag_{m}\mid u_{q},r_{q})\), and select the top n elements to recommend to the user \(u_{q}\), that is:

$$\begin{array}{@{}rcl@{}} \widehat{Tag}^{n}(u_{q},r_{q})=\max \limits_{tag_{m} \in Tag}^{n} (p(tag_{m}\mid u_{q},r_{q})). \end{array} $$
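The mature-status strategy is simple enough to sketch in a few lines. The sketch assumes that the user's and the resource's tag weights are given as dictionaries, that a tag missing from one side contributes probability zero on that side, and that ties are broken alphabetically; the function name and the toy weights are illustrative.

```python
def recommend_mature(user_tags, resource_tags, gamma, n):
    """Eqs. (8)-(10) followed by top-n selection."""
    n_u = sum(user_tags.values())        # N_{Tag_{u_q}}
    n_r = sum(resource_tags.values())    # N_{Tag_{r_q}}
    score = {}
    for tag in set(user_tags) | set(resource_tags):
        p_u = user_tags.get(tag, 0) / n_u if n_u else 0.0          # Eq. (8)
        p_r = resource_tags.get(tag, 0) / n_r if n_r else 0.0      # Eq. (9)
        score[tag] = (1 - gamma) * p_u + gamma * p_r               # Eq. (10)
    return sorted(score, key=lambda t: (-score[t], t))[:n]


# Hypothetical weights: the user's own tags and the tags already on the resource.
top2 = recommend_mature({"ml": 2, "nlp": 2}, {"nlp": 2, "tagging": 2}, gamma=0.4, n=2)
```

No neighbors or group members are consulted here, which is why this strategy is cheaper than the one for the growing status.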

5.3 User tagging status in dormant status

When the given user \(u_{q}\) tags the given resource \(r_{q}\), if the user’s tagging status at that moment is the dormant status, then the user did not tag during the period before that moment. Thus, we compute the tag probability distribution with SLM using the tags labeled to the resource \(r_{q}\) and its similar resources.

The strategy for the user tagging status in dormant status, abbreviated as TR-DS, is described as follows.

Step1: estimate the probability \(p(tag_{m}\mid r_{q})\) that the tag \(tag_{m}\) will be labeled to the resource \(r_{q}\) using the same method as in Section 5.1.

Step2: sort the tags according to the probability \(p(tag_{m}\mid r_{q})\), and then select the top n elements with the highest rank values to recommend to the user \(u_{q}\), that is:

$$\begin{array}{@{}rcl@{}} \widehat{Tag}^{n}(u_{q},r_{q})=\max \limits_{tag_{m} \in Tag}^{n} (p(tag_{m}\mid r_{q})). \end{array} $$

6 Experiments

In this section, we conduct various experiments to evaluate and analyze the effectiveness and efficiency of the proposed method on two datasets. In the first set of runs, we give examples to assess the effectiveness of the DUTS algorithm. In the second set of runs, we experimentally obtain the threshold values used in the TR-UTS algorithm and in the most popular tags algorithm (Jäschke et al. 2008). In the third set of runs, we report results of TR-UTS, TR-GS, TR-MS and TR-DS. In the fourth set of runs, we compare the proposed TR-UTS algorithm with other algorithms, namely FolkRank (Kim and El Saddik 2011), LocalRank (Kubatz et al. 2011) and most popular tags ρ-mix (Jäschke et al. 2008). Before reporting these experimental results, we introduce the dataset preprocessing and the evaluation criteria that we adopt.

6.1 Dataset preprocessing

Two datasets of social tagging systems are used in the experiments, namely CiteULike and Last.fm. CiteULike is a web service which allows users to save and share citations to academic papers. Users can organize their libraries with freely chosen tags, and this produces a folksonomy of academic interests. Last.fm is a music website that offers numerous social networking features and can recommend and play artists similar to the user’s favourites. Although there is no explicit information about which group a user belongs to, friendships between users exist in these two folksonomies. Thus, we find each user’s friends in the original data and place the user and his or her friends into a group.

The original datasets are too sparse to be used for experiments. Therefore, the p-core algorithm at level k (Batagelj and Zaveršnik 2011) is applied to the datasets so that every user, every resource and every tag appears at least k times in the processed datasets. The statistics after preprocessing are shown in Table 1. The first column names the statistics of the datasets, the second column gives the statistics of CiteULike when k=30, and the third column gives the statistics of Last.fm when k=10.
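The pruning can be illustrated with a simple fixed-point iteration. This is a simplified reading of the p-core idea rather than the cited algorithm itself: it assumes that "appears k times" means "appears in at least k posts", that posts are (user, resource, tag set) triples, and that a post whose tags are all pruned is dropped. The function name and the toy posts are illustrative.

```python
from collections import Counter


def p_core(posts, k):
    """Repeatedly prune until every user, resource and tag occurs in at least k posts."""
    posts = [(u, r, set(tags)) for u, r, tags in posts]
    while True:
        users = Counter(u for u, _, _ in posts)
        resources = Counter(r for _, r, _ in posts)
        tags = Counter(t for _, _, ts in posts for t in ts)
        pruned = []
        for u, r, ts in posts:
            if users[u] < k or resources[r] < k:
                continue                      # drop posts of rare users or resources
            kept = {t for t in ts if tags[t] >= k}
            if kept:                          # drop posts left without any tag
                pruned.append((u, r, kept))
        if pruned == posts:                   # fixed point reached
            return pruned
        posts = pruned


# Hypothetical posts; with k=2 only the tag "a" and users u1, u2 survive.
posts = [
    ("u1", "r1", {"a", "b"}),
    ("u1", "r2", {"a"}),
    ("u2", "r1", {"a"}),
    ("u2", "r2", {"a", "c"}),
    ("u3", "r3", {"a"}),
]
core = p_core(posts, k=2)
```

The loop is needed because removing one rare entity can push another below the threshold.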

Table 1 Some information of datasets after preprocessing

In the CiteULike dataset, the data between 2004/11/04 and 2012/09/30 are chosen as the train set, and the data between 2012/10/01 and 2012/10/16 as the test set. In the Last.fm dataset, we choose the data between 2005/08/01 and 2011/02/28 as the train set, and the data between 2011/03/01 and 2011/05/09 as the test set. The train sets are used to show that the statistical results are consistent with the results of the DUTS algorithm in Section 6.3.1, and to determine the parameters in Section 6.3.2; the other experiments are carried out on the test sets.

6.2 Evaluation criteria

To measure recommendation quality, we adopt the standard evaluation criteria of the information retrieval field, namely the recall (R@n), the precision (P@n) and the F-measure (F@n) at the Top-n (Kim and El Saddik 2011), where n is the size of the recommended tag set.

Let U be the data set. For \(u_q \in U\), the tag post \(a = (u_q, r_q, Tag(u_q, r_q))\) is created when the user \(u_q\) annotates a resource \(r_q\) with a set of tags \(Tag(u_q, r_q)\). That is, \(Tag(u_q, r_q)\) is the set of tags that the user \(u_q\) assigned to the resource \(r_q\) in the data. \(\widehat{Tag}^{n}(u_{q},r_{q})\) is the set of top-n recommended tags for the user-resource pair.

Recall is a common metric for evaluating the utility of recommendation algorithms and is a measure of completeness. Precision is another common metric for measuring the usefulness of recommendation algorithms and is a measure of exactness. For a given user-resource pair \((u_q, r_q)\) and a recommended tag set of size n, R@n measures the percentage of tags in the tag set of the corresponding post that appear in the recommended tag set, and P@n measures the percentage of tags in the recommended tag set that appear in the tag set of the corresponding post. R@n and P@n are defined as follows:

$$ R@n = \frac{|Tag(u_{q},r_{q})\cap \widehat{Tag}^{n}(u_{q},r_{q})|}{|Tag(u_{q},r_{q})|}, $$
(11)
$$ P@n = \frac{|Tag(u_{q},r_{q})\cap \widehat{Tag}^{n}(u_{q},r_{q})|}{|\widehat{Tag}^{n}(u_{q},r_{q})|}. $$
(12)

Both P@n and R@n depend on n: as n grows, R@n cannot decrease, whereas P@n tends to decrease. Therefore, we also adopt F@n as a measurement, which is the harmonic mean of P@n and R@n and is defined as follows:

$$ F@n = \frac{2 \times R@n \times P@n}{R@n + P@n}. $$
(13)

The overall recall, precision and F-measure for all users are computed by averaging the individual precisions and recalls, respectively.
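Equations (11) to (13) and the averaging step can be sketched in a few lines. The function names and the sample posts are hypothetical. Computing the overall F-measure from the averaged recall and precision is one reading of the sentence above and is an assumption; averaging the per-post F values would be the alternative.

```python
def metrics_at_n(true_tags, ranked_tags, n):
    """Return (R@n, P@n, F@n) for one post, following Eqs. (11)-(13)."""
    true_tags = set(true_tags)
    top_n = set(ranked_tags[:n])
    hits = len(true_tags & top_n)
    recall = hits / len(true_tags) if true_tags else 0.0
    precision = hits / len(top_n) if top_n else 0.0
    denom = recall + precision
    f_measure = 2 * recall * precision / denom if denom else 0.0
    return recall, precision, f_measure

def averaged_metrics(posts, n):
    """Average R@n and P@n over posts, then derive F@n from the averages (assumption)."""
    per_post = [metrics_at_n(true, ranked, n) for true, ranked in posts]
    avg_r = sum(r for r, _, _ in per_post) / len(per_post)
    avg_p = sum(p for _, p, _ in per_post) / len(per_post)
    denom = avg_r + avg_p
    avg_f = 2 * avg_r * avg_p / denom if denom else 0.0
    return avg_r, avg_p, avg_f

# Hypothetical posts: (true tag set, ranked recommendation list).
eval_posts = [
    ({"a", "b", "c", "d"}, ["a", "x", "b", "y", "z"]),
    ({"a"}, ["a", "b"]),
]
single = metrics_at_n(*eval_posts[0], n=5)    # 2 hits out of 4 true and 5 recommended tags
overall = averaged_metrics(eval_posts, n=2)
```

For the first post at n = 5, two of the four true tags are recovered, so R@5 = 0.5, P@5 = 0.4 and F@5 = 4/9.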

6.3 Experimental results

The experiments consist of three parts. The first part shows examples of determining user tagging status on the train sets. The second part determines the parameters of our tag recommendation algorithm, as well as ρ in the most popular tags ρ-mix algorithm, on the train sets. The third part compares TR-UTS with TR-GS, TR-MS and TR-DS, and then with FolkRank, LocalRank and the most popular tags ρ-mix algorithm, on the test sets.

6.3.1 Results of the determining user tagging status algorithm

First, we calculate user tagging status using the determining user tagging status (DUTS) algorithm proposed in Section 4.2. One month is chosen as the unit of time in our experiments. For the CiteULike dataset, the data between 2005/08/04 and 2012/09/30 (a total of 86 months) are considered; the time interval is \([T_0 = 1, T_t = 86]\) and the time cell is 1 month. For the Last.fm dataset, the data between 2005/10/01 and 2011/02/28 (a total of 65 months) are considered; the time interval is \([T_0 = 1, T_t = 65]\) and the time cell is 1 month.
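The mapping from a timestamp to its time cell can be sketched as follows, assuming that time cells are calendar months counted from the month of the first considered day. The helper name `month_index` is hypothetical; under this assumption the two periods above yield exactly 86 and 65 cells.

```python
from datetime import date

def month_index(day, start):
    """1-based index of the calendar month containing `day`, counted from the month of `start`."""
    return (day.year - start.year) * 12 + (day.month - start.month) + 1

citeulike_cells = month_index(date(2012, 9, 30), start=date(2005, 8, 4))
lastfm_cells = month_index(date(2011, 2, 28), start=date(2005, 10, 1))
```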

Owing to space limits, Tables 2 and 3 show only examples of the results of the DUTS algorithm. Table 2 shows the user tagging status calculated by the DUTS algorithm for the user ID=103 in the CiteULike dataset during this period; Table 3 shows the corresponding results for the user ID=410 in the Last.fm dataset. The thresholds used here are determined empirically, because different folksonomies have different characteristics: for the CiteULike dataset, Δt = 4 and α = 3; for the Last.fm dataset, Δt = 2 and α = 2.

Table 2 The results of DUTS algorithm for the user ID=103 in the CiteULike dataset
Table 3 The results of DUTS algorithm for the user ID=410 in the Last.fm dataset

As a check, we also count the total number of tags used by each user over a period of time in the datasets. As examples, Figs. 2 and 3 give the statistics of the number of tags used by the user ID=103 in the CiteULike dataset and by the user ID=410 in the Last.fm dataset, respectively. The totalTagNum curve shows the total number of tags used by the user from the 1st month to the current t-th month; that is, totalTagNum represents \(g_{u_{q}}(T_{t})\). The perMonthTagNum curve shows the number of tags used by the user in each time cell; that is, perMonthTagNum represents \(f_{u_{q}}(T_{t})\).
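The two curves can be computed from one user's tagging events as sketched below. The function name and the event layout are hypothetical. The sketch counts tag assignments; whether the paper counts assignments or distinct tags is not stated here, so that choice is an assumption.

```python
from collections import Counter
from itertools import accumulate

def tag_curves(monthly_tag_events, n_months):
    """Return (per_month, total) for one user over months 1..n_months.

    monthly_tag_events: iterable of (month_index, tag) pairs.
    per_month[t-1] plays the role of perMonthTagNum in month t,
    total[t-1] the role of totalTagNum up to and including month t.
    """
    counts = Counter(month for month, _ in monthly_tag_events)
    per_month = [counts.get(t, 0) for t in range(1, n_months + 1)]
    total = list(accumulate(per_month))  # running sum of the monthly counts
    return per_month, total

# Hypothetical events: two tags in month 3 and one in month 5.
per_month, total = tag_curves([(3, "a"), (3, "b"), (5, "a")], n_months=6)
```

A flat stretch of the cumulative curve corresponds to months in which the per-month count is zero.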

Fig. 2 The statistical results of numbers of tags used by user ID=103 in the CiteULike dataset

Fig. 3 The statistical result of numbers of tags used by user ID=410 in the Last.fm dataset

Let us compare the results in Table 2 with Fig. 2. In Fig. 2, the user ID=103 clearly does not tag from the 1st month to the 5th month, during which the total number of tags is zero; the statistics thus show that the user’s tagging status is the third case (the dormant status) during this period. From the 6th month to the 56th month, totalTagNum (the user’s total number of tags) increases, and it increases especially sharply after the 21st month, which shows that the user’s tagging status is the first case (the growing status) during this period. From the 57th month onward, the user’s total number of tags increases slowly, although the monthly increase is nonzero for the most part, which shows that the user’s tagging status is the second case (the mature status) during this period. The results in Table 2 are almost the same as these statistical results. Similarly, the results in Table 3 and Fig. 3 are consistent with each other.

Furthermore, additional statistical results show that the output of the DUTS algorithm is consistent with the observed tagging behaviour in most cases, which indicates that the DUTS algorithm does identify the tagging status of users.
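The visual reading of the curves described above can be imitated with a simple windowed threshold rule. The sketch below is a hypothetical rule of thumb written only to make that reading concrete; it is not the DUTS algorithm of Section 4.2, and its parameters `window` and `min_rate` are illustrative names that should not be equated with Δt and α.

```python
def status_from_counts(per_month, window, min_rate):
    """Label each month 'dormant', 'growing' or 'mature' from recent tagging activity.

    Hypothetical rule (NOT the DUTS algorithm): over the last `window` months,
    no tags means dormant, at least `min_rate` tags per month on average
    means growing, and anything in between means mature.
    """
    labels = []
    for t in range(len(per_month)):
        recent = per_month[max(0, t - window + 1): t + 1]
        used = sum(recent)
        if used == 0:
            labels.append("dormant")
        elif used >= min_rate * len(recent):
            labels.append("growing")
        else:
            labels.append("mature")
    return labels

# Made-up monthly counts: silence, a burst of tagging, then occasional tags.
labels = status_from_counts([0, 0, 6, 6, 4, 1, 0, 1], window=2, min_rate=3)
```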

6.3.2 Calculating parameters

According to the DUTS algorithm, we divide the train set into three parts: the growing subset, the mature subset and the dormant subset. Then, we run the corresponding tag recommendation policy on each of these three subsets with different parameter values, and record the values for which F@n is best.

Meanwhile, we also run the most popular tags ρ-mix algorithm of Jäschke et al. (2008) with different values of ρ on the train set to find the ρ that yields the best F@n.
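This parameter selection amounts to an exhaustive grid search, which can be sketched as follows. The `evaluate` callable stands in for a full recommendation run and is hypothetical, as is the toy scoring function, whose peak is arbitrary. The grid values are those reported for TR-GS on the growing subset, and scoring by a single number such as the mean of F@1 to F@5 is an assumption.

```python
from itertools import product

def grid_search(evaluate, grid):
    """Return (best_params, best_score) over the Cartesian product of `grid`.

    evaluate: callable taking keyword parameters and returning a score to maximize.
    grid: dict mapping a parameter name to its list of candidate values.
    """
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy stand-in for a real evaluation run; its peak location is arbitrary.
def toy_score(s1, beta):
    return -abs(s1 - 25) / 100 - abs(beta - 0.4)

grid = {
    "s1": [5, 10, 15, 20, 25, 35, 45, 55],
    "beta": [i / 10 for i in range(11)],
}
best_params, best_score = grid_search(toy_score, grid)
```

With 8 values of the neighbor count and 11 values of β, the search evaluates 88 combinations per subset.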

In this subsection, the detailed experimental procedure is given only for the CiteULike dataset; for the Last.fm dataset, we give only the final parameter values.

(1) Results of TR-GS in the Growing Subset

We adopt the TR-GS algorithm to recommend tags for each user-resource pair \((u_q, r_q)\) in the growing subset. The number of neighbors of the resource \(r_q\), \(S_1\), is set to 5, 10, 15, 20, 25, 35, 45 or 55, and β is set to 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1. To save space, Table 4 shows the performance of tag recommendation in the growing subset for only some of these parameter values.

Table 4 Results of TR-GS in the growing subset with \((S_1, \beta)\)

As indicated in Table 4, the performance of tag recommendation changes with the values of \(S_1\) and β. F@n of TR-GS is best when \(S_1 = 45\) and β = 0.6.

(2) Results of TR-MS in the Mature Subset

We adopt the TR-MS algorithm to recommend tags for each user-resource pair \((u_q, r_q)\) in the mature subset. In this experiment, γ is set to 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1. For clarity, Table 5 shows the performance of tag recommendation only for γ = 0.3, 0.4, 0.5, 0.6, 0.7 and 0.8.

Table 5 Results of TR-MS in the mature subset with γ

As indicated in Table 5, the value of γ affects the results. F@n of TR-MS is best when γ = 0.5.

(3) Results of TR-DS in the Dormant Subset

We adopt the TR-DS algorithm to recommend tags for each user-resource pair \((u_q, r_q)\) in the dormant subset. In this experiment, the number of neighbors \(S_2\) of the resource \(r_q\) is set to 5, 10, 15, 20, 25, 35, 45 or 55. Table 6 shows the performance of tag recommendation on the dormant subset.

Table 6 Results of TR-DS in the dormant subset with \(S_2\)

As indicated in Table 6, the value of \(S_2\) affects the results. F@n of TR-DS is best when \(S_2 = 10\).

(4) Results of the Most Popular Tag Algorithm

We adopt the most popular tags ρ-mix algorithm to recommend tags for each user-resource pair \((u_q, r_q)\) in the train set. Following Jäschke et al. (2008), ρ is set to 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1. For clarity, Table 7 shows the performance of tag recommendation only for ρ = 0.5, 0.6, 0.7, 0.8, 0.9 and 1.

Table 7 Results of the most popular tag algorithm with ρ

As indicated in Table 7, the value of ρ affects the results. F@n is best when ρ = 0.9.

In conclusion, we determine the appropriate parameters for the corresponding subsets of CiteULike according to the results from F@1 to F@5. For the growing subset, \(S_1 = 45\) and β = 0.6; for the mature subset, γ = 0.5; for the dormant subset, \(S_2 = 10\); and for the most popular tags algorithm, ρ = 0.9.

The best parameters for the Last.fm dataset are determined in the same way: \(S_1 = 5\) and β = 0.9 for the growing subset; γ = 0.5 for the mature subset; \(S_2 = 5\) for the dormant subset; and ρ = 1.0 for the most popular tags algorithm.

The parameters of the FolkRank algorithm proposed in Kim and El Saddik (2011) are set as in that reference: d = 0.7, the number of iterations is 10, and the corresponding weights of the preference vector are set to 1 + |U| and 1 + |R|.

6.3.3 Results and analysis

To make the comparison objective, we use the best parameters determined above for TR-UTS and the most popular tags ρ-mix algorithm in the following experiments.

(1) Validation of User Tagging Status

To validate the effectiveness of the proposed definition of user tagging status, we compare the results of TR-UTS, TR-GS, TR-MS and TR-DS on the test sets of the CiteULike and Last.fm datasets. Tables 8 and 9 give the comparison results.

Table 8 Results of TR-UTS, TR-GS, TR-MS and TR-DS on CiteULike dataset
Table 9 Results of TR-UTS, TR-GS, TR-MS and TR-DS on Last.fm dataset

From Table 8, we can see that the F@n of TR-UTS is better than that of TR-GS and TR-MS, and much better than that of TR-DS. The mean F@n of TR-UTS is 1.71%, 2.58% and 21.78% higher than that of the other three algorithms, respectively. This indicates that the strategy proposed in this paper, which first determines the user tagging status and then chooses a tag recommendation strategy according to that status, performs better than any single strategy. Meanwhile, TR-GS performs slightly better than TR-MS and much better than TR-DS, which suggests that it is effective to consider the tags of users in the same group and of similar resources when recommending tags. From Table 9, we can see that the F@n of TR-UTS is the best when n ∈ {3, 4, 5}; when n = 1 or n = 2, the F@n of TR-UTS is slightly lower than that of TR-GS, but much better than that of TR-MS and TR-DS.

In conclusion, TR-UTS is overall more effective than the other three strategies. Therefore, we adopt the proposed TR-UTS algorithm for the comparison with other existing methods below.

(2) Comparison Experiments

To assess the performance of the proposed method, we compare the proposed TR-UTS algorithm with FolkRank (Kim and El Saddik 2011), LocalRank (Kubatz et al. 2011) and the most popular tags ρ-mix algorithm (Jäschke et al. 2008) on the CiteULike and Last.fm datasets.

To make the comparison objective, the following experiments use the best parameters obtained in Section 6.3.2 for both the TR-UTS algorithm and the most popular tags ρ-mix algorithm. Accordingly, Popular 0.9-mix denotes the most popular tags ρ-mix algorithm with ρ = 0.9 on CiteULike, and Popular 1.0-mix denotes it with ρ = 1.0 on Last.fm.

(a) The comparison results on the CiteULike dataset are shown in Tables 10, 11 and 12.

Table 10 P@n of algorithms on CiteULike dataset
Table 11 R@n of algorithms on CiteULike dataset
Table 12 F@n of algorithms on CiteULike dataset

According to Tables 10 and 11, TR-UTS is slightly worse than LocalRank only at P@5 and R@5, and is much better than the other three algorithms at every other P@n and R@n.

According to Table 12, the mean F@n of TR-UTS is 18.25%, 7.14% and 1.45% higher than that of FolkRank, Popular 0.9-mix and LocalRank, respectively. For F@1 in particular, TR-UTS clearly outperforms FolkRank and Popular 0.9-mix, with increments of 46.56% and 20.29%, respectively.
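The text does not state whether such increments are relative gains or percentage-point differences. The sketch below shows the computation under the assumption that they are relative gains of the mean F@n; the function names are hypothetical and the F values are made up for illustration, not taken from Table 12.

```python
def mean(values):
    return sum(values) / len(values)

def relative_gain(ours, baseline):
    """Relative gain, in percent, of mean(ours) over mean(baseline)."""
    return 100.0 * (mean(ours) - mean(baseline)) / mean(baseline)

# Made-up F@1..F@5 values, for illustration only.
f_ours = [0.30, 0.32, 0.31, 0.29, 0.28]
f_base = [0.25, 0.27, 0.26, 0.24, 0.23]
gain = relative_gain(f_ours, f_base)  # mean 0.30 against mean 0.25
```

With these made-up values the means are 0.30 and 0.25, so the relative gain is 20%, whereas the percentage-point difference would be only 5.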

(b) The comparison results on the Last.fm dataset are shown in Tables 13, 14 and 15.

Table 13 P@n of algorithms on Last.fm dataset
Table 14 R@n of algorithms on Last.fm dataset
Table 15 F@n of algorithms on Last.fm dataset

According to Tables 13 and 14, TR-UTS performs better than the other three algorithms at R@n. For P@n, TR-UTS is slightly worse than the comparison algorithms at P@1 and P@2, but better in the other cases.

According to Table 15, the mean F@n of TR-UTS is 1.46%, 2.04% and 5.13% higher than that of FolkRank, Popular 1.0-mix and LocalRank, respectively. For F@3 in particular, the increments of TR-UTS are 3.93%, 5.38% and 6.47%, respectively.

To give an overall view of the experiments on the CiteULike and Last.fm datasets, Figs. 4 and 5 show the F@n of the different tag recommendation algorithms.

Fig. 4 F@n of each algorithm on the CiteULike dataset

Fig. 5 F@n of each algorithm on the Last.fm dataset

The above experimental results indicate that, on both datasets, TR-UTS achieves better overall tag recommendation performance than FolkRank, the most popular tags ρ-mix algorithm and LocalRank.

7 Conclusions

This paper presented a novel method based on user tagging status to improve the quality of tag recommendation in folksonomies. The paper first introduced three types of user tagging status after analysing the statistics of the total number of tags used by a user during a period of time. At any moment, a user’s current tagging status is one of these three: the growing status, the mature status or the dormant status. The paper then presented the determining user tagging status (DUTS) algorithm. Next, different recommendation strategies were developed for the different user tagging statuses by computing the tag probability distribution in the users’ and resources’ tag spaces based on the statistical language model. Compared with FolkRank, LocalRank and the most popular tags ρ-mix algorithm, the proposed algorithm achieves better accuracy, which also validates the effectiveness of the concept of user tagging status introduced in this paper.

In this paper, the user tagging status is determined by analysing the historical tagging behavior of the user, but the backward time τ is fixed in advance. Developing a model of the user’s interests could help to resolve this problem and is part of our future work. In addition, this paper clusters users using the friendships that already exist in the folksonomy system; developing a dedicated approach to clustering users is another direction for further research.