1 Introduction

Social websites now hold a massive amount of data, and finding the appropriate resource based on a user’s preferences is receiving growing attention, Milicevic et al. [20] and Fang et al. [9]. According to recent statistics, Twitter has more than 329 million users, and 930 million people use WhatsApp, with more than 340 million daily active users. Researchers have started to employ social tagging, known as Folksonomy, to solve the problem of finding the suitable resource for each user query according to the user’s taste and preferences, Alhamid et al. [3], Deshpande and Karypis [6], and Doerfel et al. [8].

Users of Folksonomy are not tied to a specific hierarchy or structure; they are permitted to use any word or combination of words to describe and annotate resources, Huang et al. [14], Pirolli and Kairam [22] and Hossain et al. [13]. Because of this flexibility, Folksonomy is a good tool for organizing and sharing resources on the web (see Fig. 1).

Fig. 1 Folksonomy

Some studies, such as Yang et al. [27], analyzed social media, mainly the tags created by a number of users, and found that a collection of media tags carries enough information to identify and label the key ideas of the media content. Studying collaborative tagging (Folksonomy) as a tool to improve personalization in social media recommendation is therefore crucial. Keyword query is the dominant search pattern on most websites because it is easy and efficient. However, users from different domains may have different views and understandings of the same item or tag, Yang et al. [27]. This creates the need for a tag-based personalized search that answers the search query according to each user’s taste. For this reason, Google, Bing and other search engines have shifted towards personalized search in their algorithms.

The work in this paper explores social collaborative tagging as a means to expand and enhance social media recommendation. Our model addresses the following questions:

  • Question A. How to suggest proper tags to users during the annotation/recommendation process?

  • Question B. How to achieve a different weight score for each searched item?

Most researchers tried to answer one of these questions in their proposed models, but not both at the same time. They mainly compute the tag-tag similarity, item-item similarity and user-user similarity. The missing factor is to combine all these similarities and to consider the situation where there are not enough tags, or where new users have just joined the social media site or community. Even though Folksonomy is a good tool for personalized search, it suffers from some limitations. The first arises when items are annotated with few tags or when new users have never tagged any item; this is known as the cold start problem for users/items. To deal with this limitation we need to aggregate over all users and items whenever we compute the item or tag weight. Another limitation occurs when some users label an item with a precise tag while other users use a generic tag. Some users also use vague or unclear tags, which results in synonymous tags.

Enriching media recommendation through social collaborative tagging is the goal of this work. We first compute the tag-item weight model with respect to similar tags and the user-item weight model with respect to similar items. We then combine both weights to answer the user query according to the user’s preferences, so that the query result is personalized.

The rest of this paper is organized as follows. Section 2 introduces related work on social tagging. Section 3 covers folksonomy types and representation. Section 4 presents the Pearson’s chi-squared similarity model. Section 5 introduces the main ranking model. Section 6 reports the experimental results, and Section 7 concludes the paper.

2 Related literature

We will discuss a number of research papers related to our problem.

2.1 Annotation

Stone et al. [24] proposed ranking images from collaborative tagging systems based on the SVD model. A similar model was introduced by Krestel et al. [18], who applied Latent Dirichlet Allocation (LDA) to study latent topics and extract hidden items from the folksonomy. Another approach was proposed by Diaz-Aviles et al. [7] to rank items by extracting the latent topics using the LDA technique. Wattenberg et al. [25] determine the annotation item using a data visualization model, in which the user can annotate the item based on the previous history.

2.2 Search

Xie et al. [26] presented a framework to integrate different weighted scores for the searched item based on the user’s need. Ifada and Nayak [16] interpret the observed and non-observed tags and rank them according to a graded-relevance interpretation scheme. Mao et al. [19] tackle tag-based recommendation by studying users’ tag co-occurrence, which provides the links that the PageRank algorithm uses to transform item scores into recommendations. Balakrishnan et al. [4] study face models to tag photos and then search the tagged photos; the authors proposed an AutoTag application as a proof of concept. Another work, by Agharwal et al. [1], addresses the semantic query gap caused by wide class variance and inadequate vocabulary. Ha et al. [10] propose a new scheme to enhance tag-based image retrieval accuracy, in which images are grouped by semantic similarity and by information produced by the folksonomy.

2.3 Recommendation

Zhao et al. [29] modeled the tagged items and the user’s profile as a graph-based ranking of multi-type interconnected items. The tags are sorted according to the GRoMO score and recommended to the user accordingly. Zhao et al. [30] introduced a fast recommendation algorithm that uncovers non-overlapping user clusters and the corresponding overlapping item clusters simultaneously. Min et al. [21] similarly use singular value decomposition to enhance tag recommendation; the tag weight is computed according to the number of users who used the tag and according to the tagged locations. Chen et al. [5] mine the semantic information of tags for each item and user and then employ it in matrix factorization to uncover the semantic information between users and items. Zhenzhen et al. [31] introduce a multi model item recommendation based on user similarity, which computes the users’ trust relation by adapting the transfer matrix into a random walk.

3 Study of folksonomy

3.1 Social websites

We are becoming addicted to social websites, and users of different ages use them on a daily basis because of the benefits and advantages of these websites. Social websites allow users to collaborate with other people, build social relations, share interests and advertise their businesses. Figure 2 shows some examples of social websites.

Fig. 2 Some examples of the social websites. http://www.blog.skytopper.com/tag/social-networking

3.2 Broad and narrow folksonomy

Based on who has the right to tag or annotate an item, we can classify Folksonomy into broad and narrow, Hassan-Montero and Herrero-Solana [12]. The left part of Fig. 3 shows a broad Folksonomy. Even though the creator did not use Tag 3, he can still retrieve the item using Tag 3. Delicious is an example of a broad Folksonomy and Flickr is an example of a narrow Folksonomy.

Fig. 3 Broad and narrow

4 Similarity model

Before we introduce our model, Table 1 lists all the terms used in this paper and their associated meanings.

Table 1 Terms and their associated meanings

Most studies, e.g. Qian et al. [23], treat Folksonomy as a 3-dimensional space that spans users, tags, and items. The 3-dimensional Folksonomy can be reshaped into three 2-dimensional spaces, which display the following relations:

  • User-tag

  • User-item

  • Tag-item

From the above relations, different similarities can be computed among tags, items or users. Depending on the relation used to calculate the tag-tag, item-item or user-user similarity, we obtain different similarity scores. In a previous work we employed the cosine score to measure the similarity among tags and items. In this paper the similarities are characterized using the homogeneity score, which corresponds to the Pearson’s chi-squared goodness-of-fit test for independence, Agresti and Kateri [2].
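The projection of the 3-dimensional folksonomy onto the three 2-dimensional relations can be sketched as follows. This is only an illustrative sketch: the function name `project`, the triple layout `(user, tag, item)` and the toy assignments are assumptions made for the example, and each cell simply counts how often a pair co-occurs.

```python
from collections import Counter

def project(assignments):
    """Project (user, tag, item) triples onto three 2-D frequency tables."""
    user_tag, user_item, tag_item = Counter(), Counter(), Counter()
    for user, tag, item in assignments:
        user_tag[(user, tag)] += 1    # how often the user applied the tag
        user_item[(user, item)] += 1  # how many tags the user put on the item
        tag_item[(tag, item)] += 1    # how often the tag was put on the item
    return user_tag, user_item, tag_item

# Hypothetical toy folksonomy (not taken from the datasets in this paper).
assignments = [
    ("u1", "sunset", "photo1"),
    ("u1", "beach", "photo1"),
    ("u2", "sunset", "photo2"),
    ("u2", "sunset", "photo1"),
]
user_tag, user_item, tag_item = project(assignments)
```

Pairs that never co-occur are simply absent from the counters and read back as 0, which keeps the sparse relations compact.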

The intuition behind the homogeneity score is explained by applying the Pearson’s chi-squared test to the tag-item matrix. The test value calculated below is used to determine whether there is a significant association between tags and items. The higher the score, the more evidence we have to conclude a strong association between tags and items. This means that tag frequencies can be used to recommend items. On the other hand, low scores imply that items and tags are independent, which means that knowing the tag frequencies will not provide accurate item recommendations.

4.1 Tag-tag

Given the tag-item relation with dimension |I| items (rows) and |T| tags (columns), we calculate the tags connection, to investigate items tagged with related tags:

  1. Let \( {R}_i={\displaystyle {\sum}_{j=1}^{\left|T\right|}{d}_{ij}} \) be the ith row sum, where i = 1, …,|I|, and let \( {C}_j={\displaystyle {\sum}_{i=1}^{\left|I\right|}{d}_{ij}} \) be the jth column sum, where j = 1, …,|T|.

  2. Construct the relative difference matrix \( \widehat{D}=\left[{\widehat{d}}_{ij}\right] \), where \( {\widehat{d}}_{ij}=\frac{{\left({R}_i{C}_j-{d}_{ij}\right)}^2}{R_i{C}_j} \).

  3. Calculate the homogeneity score, \( {H}_{xy} \), between tag x and tag y as follows:

$$ {H}_{xy}={\displaystyle \sum_{k=1}^{\left|I\right|}{\widehat{d}}_{xk}{\widehat{d}}_{ky}} $$

\( {H}_{xy} \) represents the homogeneity similarity score between tag x and tag y. We denote the tag-tag similarity matrix as \( {\mathbf{T}}_{\mathbf{t}}\mathbf{M} \).
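The three steps above can be sketched in a few lines. Two points are assumptions of this sketch rather than statements from the paper: the product in step 3 is read as the matrix product of the transposed relative difference matrix with itself, so that both x and y index tags (columns), and cells whose row or column sum is zero are set to 0 to avoid a division by zero. The toy matrix is hypothetical. The same routine could be applied to the user-item matrix W of Section 4.2.

```python
def relative_difference(d):
    """Step 2: d_hat[i][j] = (R_i*C_j - d[i][j])**2 / (R_i*C_j)."""
    row_sums = [sum(row) for row in d]        # R_i (step 1)
    col_sums = [sum(col) for col in zip(*d)]  # C_j (step 1)
    d_hat = []
    for i, row in enumerate(d):
        new_row = []
        for j, value in enumerate(row):
            expected = row_sums[i] * col_sums[j]
            # Assumption: an empty row or column contributes 0.
            new_row.append((expected - value) ** 2 / expected if expected else 0.0)
        d_hat.append(new_row)
    return d_hat

def homogeneity(d):
    """Step 3 (assumed reading): H[x][y] = sum over rows k of d_hat[k][x] * d_hat[k][y]."""
    columns = list(zip(*relative_difference(d)))
    return [[sum(a * b for a, b in zip(cx, cy)) for cy in columns] for cx in columns]

# Hypothetical tag-item relation: 3 items (rows) x 2 tags (columns).
D = [[1, 0],
     [1, 1],
     [0, 2]]
H = homogeneity(D)  # 2 x 2 tag-tag matrix
```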

4.2 Item-item

To form the relation between related items, we employ the user-item relation W with dimension |U| users (rows) and |I| items (columns), and calculate the following:

  1. Let \( {R}_i={\displaystyle {\sum}_{j=1}^{\left|I\right|}{w}_{ij}} \) be the ith row sum, where i = 1, …,|U|, and let \( {C}_j={\displaystyle {\sum}_{i=1}^{\left|U\right|}{w}_{ij}} \) be the jth column sum, where j = 1, …,|I|.

  2. Construct the relative difference matrix \( \hat{W}=\left[{\hat{w}}_{ij}\right] \), where \( {\hat{w}}_{ij}=\frac{{\left({R}_i{C}_j-{w}_{ij}\right)}^2}{R_i{C}_j} \).

  3. Calculate the homogeneity score, \( {H}_{a,b} \), between item a and item b as follows:

$$ {H}_{a,b}={\displaystyle \sum_{k=1}^{\left|U\right|}}{\hat{w}}_{ak}{\hat{w}}_{kb} $$

\( {H}_{a,b} \) represents the homogeneity item-item similarity score between two items a and b. We denote the item-item similarity matrix as \( {\mathbf{I}}_{\mathbf{i}}\mathbf{M} \).

5 Recommendation model

First we need to build two models: one that finds the tags similar to a given tag and one that finds the items similar to a given item. To construct the first model, which represents the user-tag preference model, we utilize the product of the following relations:

$$ {\mathbf{L}}_{\mathbf{U}\mathbf{T}}={\mathbf{U}}_{\mathbf{t}}\mathbf{M}\times {\mathbf{T}}_{\mathbf{t}}\mathbf{M} $$
(1)

where \( {\mathbf{U}}_{\mathbf{t}}\mathbf{M} \) is a normalized user-tag relation, and \( {\mathbf{T}}_{\mathbf{t}}\mathbf{M} \) is the tag-tag similarity relation. The idea behind this model is to uncover tags similar to a given tag assigned by a specific user to an item. We normalize the user-tag matrix to reduce the effect of the most popular tags labeled by frequent users. The new user-tag weight among all users and tags is displayed in Fig. 4.

Fig. 4 The user-tag preference model

To construct the second model, the tag-item weight model, we utilize the product of the following relations:

$$ {\mathbf{A}}_{\mathbf{T}\mathbf{I}}={\mathbf{T}}_{\mathbf{i}}\mathbf{M}.{\mathbf{I}}_{\mathbf{i}}\mathbf{M} $$
(2)

where \( {\mathbf{T}}_{\mathbf{i}}\mathbf{M} \) is a normalized tag-item matrix, and \( {\mathbf{I}}_{\mathbf{i}}\mathbf{M} \) is an item-item similarity matrix. The latent tag model uncovers items similar to a given item labeled by a certain tag. We normalize the tag-item matrix to maximize the effect of the rarely tagged items. Fig. 5 below presents the tag-item weight model \( {\mathbf{A}}_{\mathbf{TI}} \).

Fig. 5 Latent tag annotation model
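Equations (1) and (2) are both a normalized relation multiplied by a similarity matrix, which the following sketch illustrates. The paper does not state which normalization is used, so dividing each row by its sum is an assumption of this example, as are the helper names `row_normalize` and `matmul` and all the toy values.

```python
def row_normalize(m):
    """Assumption: divide each row by its sum; all-zero rows stay zero."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

# Hypothetical inputs: 2 users, 2 tags, 3 items.
UtM = [[3, 1], [0, 2]]            # user-tag frequencies
TtM = [[1.0, 0.5], [0.5, 1.0]]    # tag-tag similarity
TiM = [[2, 0, 0], [0, 1, 1]]      # tag-item frequencies
IiM = [[1.0, 0.2, 0.0],
       [0.2, 1.0, 0.4],
       [0.0, 0.4, 1.0]]           # item-item similarity

L_UT = matmul(row_normalize(UtM), TtM)  # Eq. (1): user-tag preference model
A_TI = matmul(row_normalize(TiM), IiM)  # Eq. (2): tag-item weight model
```

In this toy example the second user never used the first tag and the second tag was never applied to the first item, yet both corresponding entries of `L_UT` and `A_TI` are non-zero, because weight is propagated through the similarity matrices.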

Due to the flexibility of folksonomy, each person is free to annotate a resource or an item with any arbitrary word. Retrieving the appropriate item according to the user’s need is therefore an important part that needs to be integrated in our model. To build the tag-based search model we employ the user-tag weight model \( {\mathbf{L}}_{\mathbf{UT}} \) and the tag-item weight model \( {\mathbf{A}}_{\mathbf{TI}} \). Given a specific user u who submits a query J consisting of one or more tags, the ranking score of an item i is computed as:

$$ {S}_u\left(i,J\right)=\frac{1}{1+{e}^{-{\displaystyle {\sum}_{t\in J}{L}_{ut}{A}_{ti}}}} $$
(3)

where \( {S}_u\left(i,J\right) \) represents the personalized ranking score with respect to user u for an item i and a set of query tags J. The model uncovers the items a user is likely to want even if those items are not labeled with the submitted query tags. New users who have never tagged any item, and items that have never been labeled (the cold start problem), can also be processed using the proposed model, since the model has the capability to uncover similar items and tags even if they are new.
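As a minimal sketch of Eq. (3), the score is the logistic function applied to the sum, over the query tags, of the user-tag weight times the tag-item weight. The function names, the integer indexing of tags and items, and the toy weights below are assumptions made for illustration.

```python
import math

def ranking_score(L_u, A, query_tags, item):
    """Eq. (3): logistic of sum_{t in J} L_ut * A_ti for one user u."""
    z = sum(L_u[t] * A[t][item] for t in query_tags)
    return 1.0 / (1.0 + math.exp(-z))

def rank_items(L_u, A, query_tags):
    """Return item indices sorted by decreasing ranking score."""
    n_items = len(A[0])
    scores = [ranking_score(L_u, A, query_tags, i) for i in range(n_items)]
    return sorted(range(n_items), key=lambda i: scores[i], reverse=True)

# Hypothetical weights: one row of L_UT (2 tags) and A_TI (2 tags x 3 items).
L_u = [0.875, 0.625]
A = [[1.0, 0.2, 0.0],
     [0.1, 0.7, 0.5]]
order = rank_items(L_u, A, query_tags=[1])
```

Because the logistic function is monotonic, it does not change the ordering produced by the raw sum; it only maps the scores into the interval (0, 1).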

6 Experimental results

6.1 Datasets

We downloaded two real datasets to validate our ranking model. The first dataset comes from Flickr, a photo management and sharing application that allows users to upload, tag and share their photos online. The dataset contains 206,564 photos from 58,199 different photographers, and 11,386 tags, Huiskes et al. [15]. The second dataset is MovieLens, an online movie sharing site that allows users to rate and tag movies and generates personalized movie predictions. The MovieLens dataset contains 5580 tags applied to 10,681 movies by 71,567 users, Harper et al. [11]. We pruned both datasets and projected them into three two-dimensional matrices. We applied the frequency weight in all matrices.

To run the experiments, we divided both datasets into a validation set and a training set. The validation set contains one tag assignment for each user, and the rest is in the training set. To obtain accurate results with a 95 % confidence interval, we performed each experiment 5 times.
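The split described above can be sketched as a leave-one-out procedure per user. How the held-out assignment was chosen is not stated, so drawing it uniformly at random with a fixed seed is an assumption of this sketch, as are the function name and the toy data.

```python
import random
from collections import defaultdict

def leave_one_out(assignments, seed=0):
    """Hold out one (user, tag, item) assignment per user for validation."""
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for triple in assignments:
        by_user[triple[0]].append(triple)
    train, validation = [], []
    for triples in by_user.values():
        held_out = rng.choice(triples)  # assumption: uniform random choice
        validation.append(held_out)
        remaining = list(triples)
        remaining.remove(held_out)
        train.extend(remaining)
    return train, validation

# Hypothetical toy assignments.
assignments = [
    ("u1", "sunset", "photo1"),
    ("u1", "beach", "photo1"),
    ("u1", "sea", "photo3"),
    ("u2", "sunset", "photo2"),
    ("u2", "city", "photo4"),
]
# Five repetitions, one per seed, mirroring the five runs per experiment.
splits = [leave_one_out(assignments, seed=s) for s in range(5)]
```

Note that a user with a single assignment would be left with no training data under this scheme, so such users may need to be pruned beforehand.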

6.2 Evaluation metrics

We employed different metrics to measure the accuracy and coverage of our proposed model.

The first metric is F-Measure.

$$ {F}_{\beta }=\left(1+{\beta}^2\right)\frac{P\times R}{\beta^2\times P+R} $$
(4)

where P is the precision, and R the recall. We also tested two versions of the F-Measure, \( {F}_2 \) and \( {F}_{0.5} \):

$$ {F}_2=\left(1+4\right)\frac{P\times R}{4\times P+R} $$
(5)

\( {F}_2 \) places more importance on recall than on precision.

$$ {F}_{0.5}=\left(1+{0.5}^2\right)\frac{P\times R}{0.5^2\times P+R} $$
(6)

\( {F}_{0.5} \), in contrast, puts more weight on precision. The second group of metrics consists of the positive predictive value (PPV) and the accuracy (ACC).

$$ PPV=\frac{\sum \text{True positive}}{\sum \text{Test outcome positive}} $$
(7)

$$ ACC=\frac{\sum \text{True positive}+\sum \text{True negative}}{\sum \text{Total population}} $$
(8)
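The metrics can be computed from precision, recall and confusion-matrix counts as in the sketch below. It uses the standard F-measure definition, in which the denominator is β²·P + R, and returns 0 when a denominator is zero; that zero-handling convention and the function names are assumptions of this example.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-measure: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0

def ppv(tp, fp):
    """Positive predictive value: true positives over all predicted positives."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def accuracy(tp, tn, fp, fn):
    """Accuracy: correct predictions over the total population."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0

# With precision above recall, F2 (recall-heavy) falls below F0.5 (precision-heavy).
f2 = f_beta(1.0, 0.5, beta=2)
f05 = f_beta(1.0, 0.5, beta=0.5)
```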

To test the coverage of our model for a given search query, we ran a test to see whether the model is capable of computing a ranking score for all items.

6.3 Effect of similar tags and items

Similar tags and items participate in computing the item ranking score. Hence we measured \( {F}_2 \) and \( {F}_{0.5} \) for different numbers of similar tags and items. We started with 10 similar tags and items, then 20, 30 and 40, and finally tested all tags and items. As shown in Table 2, the best value was achieved at 20 similar tags/items for both \( {F}_2 \) and \( {F}_{0.5} \). Larger numbers of similar tags/items contribute less to our models because of the noise associated with the additional tags/items.

Table 2 Effect of similar tags and items
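Restricting the model to the k most similar tags or items can be sketched as pruning each row of a similarity matrix. The paper does not describe how the restriction was implemented, so keeping the diagonal plus the k largest off-diagonal entries per row is an assumption of this sketch, and the matrix values are hypothetical.

```python
def keep_top_k(sim, k):
    """Keep, in each row, the diagonal and the k largest off-diagonal entries."""
    pruned = []
    for x, row in enumerate(sim):
        neighbours = sorted(
            (y for y in range(len(row)) if y != x),
            key=lambda y: row[y],
            reverse=True,
        )[:k]
        keep = set(neighbours) | {x}
        pruned.append([v if y in keep else 0.0 for y, v in enumerate(row)])
    return pruned

# Hypothetical 4 x 4 similarity matrix.
sim = [[1.0, 0.8, 0.1, 0.3],
       [0.8, 1.0, 0.4, 0.2],
       [0.1, 0.4, 1.0, 0.9],
       [0.3, 0.2, 0.9, 1.0]]
pruned = keep_top_k(sim, k=1)
```

Setting k to the number of tags or items minus one leaves the matrix unchanged, which corresponds to the "all tags and items" setting.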

6.4 Effect of normalization

We investigated the effect of matrix normalization. As mentioned earlier, the idea behind the normalization is to reduce the influence of popular tags and common items. We first checked the accuracy for 10 similar tags and items, and then increased the number until we had tested all tags and items. We compared the accuracy with and without normalized matrices. To show the statistical significance, we also performed two-tailed paired t-tests. Table 3 reports that the best accuracy value was achieved with the normalized matrices at 20 similar tags/items.

Table 3 Accuracy improvements over a non-normalized approach
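For reference, the statistic of a paired t-test over repeated runs can be computed as in the sketch below. The accuracy values are purely hypothetical placeholders, not results from this paper, and the sketch only returns the t statistic; comparing it with the critical value for n − 1 degrees of freedom (or computing a p-value) is left to a statistics library.

```python
import math

def paired_t_statistic(a, b):
    """t = mean(d) / (sd(d) / sqrt(n)) for paired differences d = a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(variance / n)

# Hypothetical accuracies over 5 runs (placeholders for illustration only).
normalized = [0.62, 0.64, 0.61, 0.65, 0.63]
non_normalized = [0.58, 0.59, 0.57, 0.60, 0.60]
t = paired_t_statistic(normalized, non_normalized)
```

The function assumes at least two runs and differences that are not all identical; otherwise the variance is zero and the statistic is undefined.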

6.5 Effect of top returned item

To compare with baseline methods, we measured the positive predictive value (PPV). We compared our model with the Social rank algorithm, Zanardi and Capra [28], and the CUM algorithm, Kim et al. [17]. We examined how each model behaves in response to the user query. We varied the number of returned items N from 1 to 10 and computed the PPV at N = 1, N = 2, and so on. We recorded which algorithm positions the returned item at a higher rank. Figure 6 shows the comparison results; our tag-based models clearly outperformed the other algorithms.

Fig. 6 Models comparison
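The PPV at N used in this comparison can be sketched as the fraction of the top N returned items that are relevant. The function name, the item identifiers and the ranked list are hypothetical and only illustrate the computation.

```python
def ppv_at_n(ranked_items, relevant, n):
    """Fraction of the first n returned items that are in the relevant set."""
    top = ranked_items[:n]
    hits = sum(1 for item in top if item in relevant)
    return hits / n

# Hypothetical ranked result list and held-out relevant item.
ranked = ["i3", "i1", "i7", "i2"]
relevant = {"i1"}
curve = [ppv_at_n(ranked, relevant, n) for n in range(1, 5)]
```

With a single held-out item per user, a model that places that item nearer the top of the list reaches a non-zero PPV at a smaller N.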

7 Conclusion

This paper tackled the issue of tag-based search and tag recommendation. We mined the related tags and items while building our ranking model. The experimental results show that our approach ranks the items and tags in a higher position according to the user’s preferences.

As future work, we can extend our approach to different contexts such as movie recommendation, smart homes, and ambient intelligence environments; this is feasible because of the simplicity of our model, which does not require any data other than the tagging information. We also plan to test different algorithms for calculating the similarity matrices, such as Pearson correlation. As mentioned earlier, synonymous tags are sometimes a problem, so we plan to combine the semantic web with our model and test the improvement with regard to this problem.