1 Introduction

With the exponential growth of the World Wide Web, the complexity and size of websites are growing accordingly. The volume of user-generated data is large enough to give rise to big data and to the problem of information overload. Recommendation is a typical challenge in artificial intelligence. E-commerce websites present users with a huge collection of items, which makes it difficult for users to choose relevant items from the given alternatives. A recommender system (RS) handles this problem of information overload by automatically recommending items that interest a user. An RS is a subclass of information retrieval systems that evaluates the preferences of a user to predict a rating for an item. The primary objective of any RS is to reduce complexity by filtering a large amount of information and selecting the information relevant to user requirements. To this end, the system creates a profile for every user and uses recommendation approaches to generate recommendations according to the user's interests.

With the development of social networking websites like Facebook, LinkedIn, and WeChat, social networks have become an indispensable part of our lives. The active users on these sites generate enormous amounts of data daily. With the availability of such rich data, it becomes all the more essential to search it and give effective recommendations. Social networking sites offer user data in different forms such as user profiles, social relations, tags, comments and posts. Recommendation systems use this data to generate useful and effective recommendations. Recently, social relations have gained much popularity as additional information for improving the quality of RS. It appears more rational to use social network data such as experience, user profile, likes, dislikes, background, belief and knowledge level in RS. The dynamics of RS is an important parameter that determines user satisfaction and accuracy (Rana and Jain 2015). As user preferences for items change with time, temporal data are significant for enhancing the accuracy of RS (Shokeen and Rana 2018a). It is difficult to improve the accuracy of RS without considering these effects (Koren 2010).

1.1 Differences between the former surveys and this survey

There is a plethora of research in the sphere of social media-based RS. However, to the best of our knowledge, very few research papers perform a systematic survey of existing works and advancements in this area. This paper presents an extensive outline of research done in social recommender systems (SRS) and identifies open challenges and future directions in this area.

In previous years, plenty of surveys have been conducted on traditional RS, but very few of them are based on social recommendation. For example, Guy (2015) gives a tutorial on research in SRS. This tutorial covers fundamental recommendation approaches, a few techniques for social media sites, evaluation methods, and some issues and challenges. Zhang et al. (2017) review various deep learning-based RS, whereas in this paper we discuss deep learning-based social RS specifically. Yang et al. (2014) survey collaborative filtering-based social RS. Tang et al. (2013) classify social RS into memory-based and model-based SRS. Schall (2015) gives an overview of social network-based RS and highlights some techniques for social recommendation, namely, link prediction, partner recommendation, follow recommendation and social broker recommendation. Recently, Shokeen and Rana (2018a) discussed the dynamics and some future challenges of SRS.

1.2 Contribution of this paper

The objective of this paper is to serve as a resource for researchers entering the area of SRS. We provide an in-depth overview of different approaches to social recommendation and thereby lay a foundation for further innovation in the field of SRS.

The major contributions of this survey are indexed as follows:

  1. This survey gives a study of social media, types of social media and the role of social media in recommendation.

  2. Based on different techniques, we conduct a systematic review of different types of SRS.

  3. We also discuss different approaches, parameters, and datasets taken in building specific SRS.

  4. We figure out different domains where experiments are conducted in research papers taken in this survey.

  5. We draw a comparison of different research papers based on the metrics used by them.

  6. A number of datasets used in different SRS are also presented.

  7. Some of the future works in this sphere are also discussed.

The remaining part of this paper is organized as follows: Section 2 provides an overview of different social media sites. Section 3 gives the definition and an overview of SRS. Section 4 discusses different techniques for designing SRS, namely deep learning, collaborative filtering, hybrid, fuzzy, clustering, semantic and group techniques. Section 5 presents the classification of shortlisted papers on the basis of domain and metrics. Section 6 presents some of the datasets frequently used in evaluating SRS. Section 7 discusses challenges for future research in this area. Lastly, Section 8 concludes the paper.

2 Social media

Social media are online platforms where users create public profiles, engage in social interactions and form relationships with other users. Research reveals that more than 80% of users use social media as part of their Internet usage. It demonstrates that we spend a major part of our daily time on social media sites. Social media has emerged as a powerful tool for e-commerce marketing, multimedia sharing, reviews, building relations, entertainment and much more. There is an impressive range of widely used social media sites. In this section, we briefly list different types of social media sites (Fig. 1).

Fig. 1 Social media websites

2.1 Social networks

A social network is a platform that is all about relationships, i.e., establishing new relationships, strengthening existing relationships and bringing relationships to new levels. The small world phenomenon is an important characteristic of social networks. The first experiments to explain this phenomenon were conducted by Milgram in 1967 (Travers and Milgram 1969). His experiments demonstrate that any two users in a social network are connected through a chain of their neighbors. Social networks are usually classified into two types: general-purpose social networking sites (SNS) such as Facebook and Twitter, and domain-specific SNS such as Epinions for product recommendations and MovieLens for recommending movies. We can further classify these sites into different classes based on their type, such as micro-blogging sites, multimedia sharing sites, entertainment sites, location-based social networks and social review sites.

Recently, with rapid growth in the number of social users, Facebook and Twitter have evolved as the platforms for public relations and marketing. Twitter is a social network where users post messages, which are called tweets.

2.2 Social bookmarking sites

Social bookmarking is a practice by which users search, organize and store bookmarks of web pages. On these sites, users apply tags while sharing and posting their articles, videos, photos and web pages. Social bookmarking is also termed social tagging. Some well-known social bookmarking sites are Del.icio.us, Digg, Slashdot, BibSonomy and Twitter.

2.3 Social Review sites

Sites where users post reviews about products, services, businesses, or people are called social review sites. Examples of these sites are Epinions, Yelp and TripAdvisor.

2.4 E-commerce sites

Social media is an efficient platform for marketing. Amazon is the largest e-commerce website. According to a recent survey by Feedvisor, more than half of U.S. brands sell products on Amazon to maximize their sales. Recently, Kim and Kim (2018) showed that social sharing boosts sales on e-commerce websites.

2.5 Geo-location sites

Geo-location SNS use geographic services, tags, location-based user data to suggest users the places and events that fit their interests. Well-known geolocation-based social networks are FourSquare, Gowalla and Yelp. In case of mobile SNS, user-submitted location or mobile phone tracking empowers location-based services to enrich social media.

2.6 Microblogging sites

Microblogging websites permit people to post small pieces of content such as images, short sentences or links. Popular microblogging websites are Twitter and Tumblr. Some SNS like LinkedIn and Facebook also possess microblogging features, which they call status updates.

2.7 Multimedia sharing websites

These are the social media sites that permit users to upload and share multimedia files such as photos, music and video. YouTube, Netflix and Vimeo are video sharing sites, whereas Instagram, Pinterest and Flickr are photo sharing sites. WhatsApp is a social media app that allows users to send and share text messages, links, images, audio, videos, locations, contacts or documents with other users.

2.8 Academic social sites

With the advancement in social media, an unprecedented number of opportunities are available with users to seek out recent ideas and information, connect with colleagues and peers, and disseminate the latest findings to the world. Managing online presence and disseminating scientific contributions and academic-related discoveries has become easier through SNS. There are a number of academic SNS like Academia.edu, Google Scholar, Semantic Scholar, Mendeley and ResearchGate. Apart from these, there are a number of apps for online learning such as edX, Learn Python and Coursera.

2.9 Discussion forums

A discussion forum is a platform where people post a query about any topic or item and expect to see responses to that query. Forums are social media sites that are well suited for longer and richer discussions. Their content tends to be longer-running, more relevant and more in-depth. Quora.com is one of the popular question-and-answer sites. Some forums, like Quora and Vault, are general-purpose forums where any type of question can be raised, whereas others are more specific, like The Artificial Intelligence Forum.

2.10 Social groups sites

Social grouping sites are platforms where people meet and hang out with people like them. Meetup is a well-known online grouping site that hosts events for people with similar tastes and preferences. A person can belong to multiple groups in Meetup. Other social group sites include Eventbrite and DownToMeet.

3 Social recommender system

RS and social media share reciprocal advantages, and each is important for the other to function accurately (Tang et al. 2013). When information from social media is integrated with an RS, the system is termed an SRS. In this paper, we define a “social recommender system” as an RS that uses social networking data to improve recommendation performance (Shokeen and Rana 2019b). Social networks are a reservoir of inexhaustible information. Explicit social networks such as Twitter, LinkedIn and Facebook allow people to express their social connections. Implicit social networks like e-mail networks and co-worker networks, on the other hand, allow relationships to be inferred from user actions. The main goal of social recommenders or SRS is to use social context to harness the social connections of users. These systems play a central role in handling the information overload problem of social networks and use various techniques to provide the most desirable and meaningful items or information to users (Guy 2015). Social networks use RS to personalize content for users and thus improve user experience. Conversely, social networks improve the quality of RS, which can recommend relevant items to users based on their preferences, relations and experience in social networks. SRS are known to mitigate issues of traditional RS such as the cold-start problem, data sparsity, fraud and trust-related issues (Shokeen and Rana 2019b). Social recommenders seek out information from the social media domain. The coupling of social media with RS has generated opportunities for the world of e-commerce, where social influence plays a major role in product marketing.

SRS harness user trust in addition to extracting similarity between users. TidalTrust by Golbeck (2006a) and MoleTrust by Massa and Avesani (2007) are two well-known algorithms for trust computation in social networks. An SRS is an improvement over a traditional RS as it employs users' trust and users' interests from social networks (Sun et al. 2015). For instance, because of trust, one may watch a video recommended by a social friend on YouTube. Because of social interest, one may read the publications of a close family member who has just attended a conference. Fan et al. (2018) show that social relations are useful for improving recommendation performance. Wang et al. (2013) illustrate the advantages of recommendations from online social networks. Social media recommendations take two forms: interest-oriented and influence-oriented. Recently, Shokeen and Rana (2019b) studied different factors affecting social recommendations. These factors are trust, tags, grouping, heterogeneous social connections, semantics, cross-domain knowledge and time. Shokeen and Rana (2018b) elucidated the influence of dynamic factors in generating social recommendations. Knijnenburg et al. (2012) conduct an online experiment on a Facebook music RS to clarify how the recommendations are drawn. Shokeen and Rana (2018a) discuss various dynamics that affect the results of SRS.

In our previous works (Shokeen and Rana 2018a; 2019b), we have explained different types of techniques used in building RS. These techniques are mainly divided into collaborative-filtering, content-based filtering, hybrid, knowledge-based, graph, and demographic filtering.

4 Techniques for social recommender systems

A number of SRS techniques, models and approaches have been proposed and implemented in the literature for recommending items. To the best of our knowledge, this paper reviews the algorithms available until 2019. We categorize the techniques for building SRS into deep learning, collaborative filtering, hybrid, fuzzy, clustering, semantic-based and group-based. We do not discuss content-based filtering (CBF) SRS separately in this paper: since an SRS employs social relationships, for which collaborative filtering-based algorithms are extensively used, CBF techniques are generally combined with other techniques for recommendation. This section describes the algorithms under each of these categories.

4.1 Deep learning-based social recommender systems

Deep learning has emerged as a sub-field of artificial intelligence. A deep model consists of several processing layers, where every subsequent layer extracts more complex features that are aggregated and passed as input to the next layers. Deep learning models are loosely inspired by the way the human brain processes information and learns. They are trained using supervised or unsupervised learning. Deep neural networks, recurrent neural networks, deep autoencoders, convolutional neural networks and restricted Boltzmann machines are some of the models used in deep learning.

To exemplify how deep learning works, we describe the autoencoder model. An autoencoder is a kind of feedforward neural network that is trained in an unsupervised manner. The aim of an autoencoder is to encode the inputs into a representation from which the output data can be made similar to the input data. Generally, an autoencoder comprises three layers, namely, an input layer, a hidden layer and an output layer. The input layer is reconstructed at the output layer by harnessing the representation obtained from the hidden layer. An autoencoder involves two main steps, encoding and decoding: encoding refers to the transition of data from the input layer to the hidden layer, whereas decoding refers to the transition of encoded data from the hidden layer to the output layer. Mathematically, we state these transitions as follows:

$$ \varphi: X \rightarrow Y:x \mapsto \varphi(x) = \sigma(Wx+b):=y $$
(1)
$$ \phi: Y \rightarrow X:y \mapsto \phi(y) = \sigma(Wy+b^{\prime}):=x^{\prime} $$
(2)

where W is the weight matrix, b and \(b^{\prime }\) are the bias vectors of the encoder and the decoder, σ is the activation function, and X is the set of inputs. During training, the encoder encodes the input data x into the latent representation y. The decoder then decodes y into the output data \(x^{\prime }\), which is trained to be as close to x as possible. For example, if we input the vector [1,0,1,0,0] at the input layer, a well-trained autoencoder outputs approximately [1,0,1,0,0] at the output layer. The aim of using autoencoders is to remove noise and reduce distortion. Autoencoders are used to extract latent features (Deng et al. 2017; Ying et al. 2016), reduce dimensionality (Unger et al. 2016) and predict missing ratings (Sedhain et al. 2015) in RSs.
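The encoding and decoding steps of Eqs. 1 and 2 can be illustrated with a minimal NumPy sketch. It is only an illustration under stated assumptions, not the implementation of any surveyed system: the sigmoid activation, the squared-error loss, plain gradient descent, the hidden size and the separate decoder matrix `W_dec` (Eq. 2 writes W for both steps) are all choices made here for the example.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation, playing the role of sigma in Eqs. 1 and 2."""
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(X, hidden=3, lr=0.5, epochs=2000, seed=0):
    """Fit a one-hidden-layer autoencoder by batch gradient descent.

    Assumptions (not from the text): squared-error loss, sigmoid units,
    and a separate decoder matrix W_dec instead of reusing W.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(0.0, 0.1, (hidden, d))      # encoder weights
    b = np.zeros(hidden)                       # encoder bias b
    W_dec = rng.normal(0.0, 0.1, (d, hidden))  # decoder weights (assumed)
    b_dec = np.zeros(d)                        # decoder bias b'
    for _ in range(epochs):
        Y = sigmoid(X @ W.T + b)               # encoding step, Eq. 1
        X_rec = sigmoid(Y @ W_dec.T + b_dec)   # decoding step, Eq. 2
        # Backpropagate the squared reconstruction error.
        d_out = (X_rec - X) * X_rec * (1.0 - X_rec)
        d_hid = (d_out @ W_dec) * Y * (1.0 - Y)
        W_dec -= lr * d_out.T @ Y / n
        b_dec -= lr * d_out.mean(axis=0)
        W -= lr * d_hid.T @ X / n
        b -= lr * d_hid.mean(axis=0)
    return W, b, W_dec, b_dec


def reconstruct(x, W, b, W_dec, b_dec):
    """Encode x into the latent y, then decode y into x'."""
    y = sigmoid(W @ x + b)
    return sigmoid(W_dec @ y + b_dec)


x = np.array([1.0, 0.0, 1.0, 0.0, 0.0])        # example vector from the text
params = train_autoencoder(x[None, :])
x_rec = reconstruct(x, *params)                # entries close to x, not exact
```

Because the output units are sigmoids, the reconstruction only approaches the input; rounding `x_rec` recovers the original binary vector in this toy setting.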

The work done in recent years indicates that deep learning has produced very promising outcomes in RS. A recent survey by Zhang et al. (2017) gives a comprehensive review of deep learning-based RSs. Shokeen and Rana (2019a), on the other hand, give an application-oriented survey of deep learning techniques in RS. This section includes some recent works in this area that have not been covered in past surveys. Further, our work is specific to SRS.

Deng et al. (2017) proposed DLMF (Deep Learning based Matrix Factorization) as a trust-based method for recommendation in social networks. They use deep autoencoders to learn the initial hidden features of items and users, and then minimize the objective function. Unlike other approaches, DLMF assumes that people trust different friends for different domains. CDR (Collaborative Deep Ranking), proposed by Ying et al. (2016), is a pair-wise framework that uses implicit feedback, such as browsing and clicking, to reduce the sparsity problem. It harnesses SDAE (Stacked Denoising Autoencoders) to extract feature representations of items and integrates the extracted information with the pair-wise ranking model. The complexities of computing the latent factors U and V are O(nrK) and O(nrK + sK1), respectively. Here, U = (ui) and V = (vj), where ui and vj are latent factors of dimension K, i = 1,2,…,n, j = 1,2,…,m, and r is the average number of interactions per user. Updating all weights and biases has complexity O(msK1). The overall complexity is O(2nrK + sK1 + msK1). Experimental results demonstrate that CDR outperforms CTR (Wang and Blei 2011), CTRank and CDL (Wang et al. 2015a) in terms of ranking prediction.

Recently, deep learning methods have been employed in cross-domain social recommendation. Wang et al. (2017) propose a Neural Collaborative Social Ranking (NCSR) approach for recommending matching items from the information domain to potential users of SNS. In this approach, a deep collaborative filtering model captures the user-item interactions. Geng et al. (2015) develop a deep learning framework, DUIF (Deep User-Image Feature), to learn features of images and users from very large, diverse and sparse social curation networks. The learned features of images and users are used to compute similarities between them and to give useful recommendations. Privacy is an issue in these systems as they rely on personal user information. To address this issue, Dang and Ignat (2017) propose a rating prediction approach called dTrust that works well for both cold-start and warm-start users. It utilizes the topology of the trust-user-item network and leverages a deep feed-forward neural network to combine user relations with user-item ratings for rating prediction. Recently, Fan et al. (2018) used a deep neural network to bridge the gap between RS and social relations. They present a model, Deep neural network on Social Relations (DeepSoR), that discovers intrinsic, non-linear and complex features from social relations between users. In a more recent work, Fan et al. (2019) leverage the power of graph neural networks, which are capable of learning representations of graph data. Graph neural networks learn both the topological structure and the node information of graphs. However, data in SRS is represented through two types of graphs, namely, the user-item graph and the user-user social graph. Owing to these heterogeneous relations, a user can be involved in both graphs simultaneously. GraphRec, proposed by Fan et al. (2019), is a graph neural network that learns representations from multiple graphs to enhance social recommendations.

Its effectiveness at extracting latent features is the main advantage of deep learning over other techniques. Table 1 presents various deep learning-based SRS.

Table 1 Deep learning-based social recommender systems

4.2 Collaborative filtering-based social recommender systems

Collaborative filtering (CF) is a basic technique used in RS that finds similar users and compares their profiles and interests to recommend items (Ricci et al. 2015). The performance of this technique depends largely on the user space. Its success is shown by its application in different areas like music, movie and hotel recommendation. Yang et al. (2014) present a broad survey of CF-based social RS and classify the CF approaches into matrix factorization-based and neighborhood-based social recommendation approaches. Matrix factorization is the most widely used approach in CF-based social RS. Among CF approaches, model-based approaches use the observed user-item ratings to train a model that is then used to predict ratings. Neighborhood-based approaches, on the other hand, directly manipulate the original user-item rating database for rating prediction.

To illustrate the CF technique, consider a working example of movie recommendation that uses the matrix factorization approach. We have a user set U = {u1,u2,u3,u4,u5,u6} and an item set I = {Action movie, Thriller movie, Romantic movie, Drama movie}, and movies are rated from 0 to 5 by users as shown in Fig. 2. The aim is to recommend an item i ∈ I to a user u ∈ U based on his/her preferences. The matrix factorization approach decomposes the user-item matrix into a user matrix and an item matrix. In the user matrix, rows characterize the users and columns characterize latent factors. In the item matrix, rows characterize latent factors and columns characterize items. In this case, the latent factors of users can be interpreted as features of users like age and gender, whereas the latent factors of items can be interpreted as features of movies like actors and genre.

Fig. 2 User-item matrix
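The factorization of the user-item matrix into a user matrix and an item matrix can be sketched as follows. This is a minimal illustration, assuming stochastic gradient descent with L2 regularization as the fitting procedure. The rating values in `R` are hypothetical (the actual values of Fig. 2 are not reproduced here), and treating a rating of 0 as "not rated" is also an assumption of this sketch.

```python
import numpy as np

# Hypothetical 6 x 4 rating matrix: rows are users u1..u6, columns are
# Action, Thriller, Romantic and Drama. A 0 is treated as "not rated".
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [5, 5, 0, 1],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
    [1, 1, 5, 0],
], dtype=float)


def factorize(R, k=2, lr=0.01, reg=0.02, epochs=2000, seed=0):
    """Approximate R by P @ Q using SGD over the observed ratings only."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(0.0, 0.1, (n_users, k))  # user matrix: users x latent factors
    Q = rng.normal(0.0, 0.1, (k, n_items))  # item matrix: latent factors x items
    observed = np.argwhere(R > 0)
    for _ in range(epochs):
        for u, i in observed:
            err = R[u, i] - P[u] @ Q[:, i]
            p_old = P[u].copy()              # keep old value for the item update
            P[u] += lr * (err * Q[:, i] - reg * P[u])
            Q[:, i] += lr * (err * p_old - reg * Q[:, i])
    return P, Q


def rmse(R, P, Q):
    """Root mean squared error on the observed ratings."""
    mask = R > 0
    return float(np.sqrt(np.mean((R - P @ Q)[mask] ** 2)))


P, Q = factorize(R)
predicted = P @ Q   # dense matrix; entries at unrated positions are predictions
```

The product `P @ Q` fills in the unrated cells, and the highest predicted values among a user's unrated items would be the candidates for recommendation. Note that the learned latent factors are not guaranteed to correspond to any human-readable feature.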

Konstas et al. (2009) experimented with social data on the Last.fm dataset to improve collaborative filtering. Extra social information about the user, in the form of friendships and social tags, is incorporated through the Random Walk with Restart (RWR) model. The SoRec (Social Recommendation) model is a trust-aware social recommendation approach presented by Ma et al. (2008) to resolve poor prediction and data sparsity problems. SoRec uses probabilistic matrix factorization to factorize the user-item rating matrix and the social network graph simultaneously. Recommendation with Social Trust Ensemble (RSTE) (Ma et al. 2009a) is a linear combination of a social network-based approach and the basic MF approach. RSTE is an improved and more realistic framework than SoRec (Ma et al. 2008) and easily combines the users’ preferences with their friends’ preferences. Furthermore, Ma et al. (2009b) explained the importance of distrust information for improving social recommendations. They devised two matrix factorization-based methods, RWD (Recommendation With Distrust) and RWT (Recommendation With Trust), to incorporate distrust and trust information into RS.

In social networks, a large number of people only join the network and do not express ratings. To handle such users and alleviate the cold-start problem, Jamali and Ester (2010) developed SocialMF, a model-based approach. Unlike the STE model, this approach incorporates a trust propagation mechanism into the matrix factorization method for recommendation in social networks. The model is trained with latent feature vectors of items and users. SocialMF differs from the STE model in that each user’s feature vector depends on the feature vectors of his directly connected neighbors. Moreover, SocialMF outperforms STE and SoRec with respect to RMSE and is much faster than the STE model. The downside of the SocialMF approach is that the user feature vectors are only influenced by direct neighbors.

In a major advance, Ma et al. (2011) argue that trust relationships are quite different from social friend relationships. They make the key distinction that a trust-aware RS works on the idea that trusted users have similar tastes. However, this is not always true in SRS, as some friends may have similar tastes while others may have diverse tastes. They explore the integration of social network information into the recommendation model. The first social regularization model is termed average-based regularization, where the taste of user ui is assumed to be close to the average taste of his friends, which is mathematically represented as follows:

$$ \frac{\alpha}{2}\sum\limits_{i=1}^{m}\left\| U_{i}-\frac{\sum_{t\in F^{+}(i)}Sim(i,t)\times U_{t}}{\sum_{t\in F^{+}(i)}Sim(i,t)}\right\|^{2}_{F} $$
(3)

where α > 0, F+(i) is the friend list of ui, and Ui and Ut denote the tastes of users ui and ut, respectively. Sim(i,t) ∈ [0,1] is the similarity function that indicates the similarity between users ui and ut, and \(\|\cdot\|^{2}_{F}\) denotes the squared Frobenius norm. However, this approach is not appropriate for users whose friends have diverse tastes, because averaging hides the differences between individual friends. Therefore, they propose another social regularization term, which they name individual-based regularization:

$$ \frac{\beta}{2}\sum\limits_{i=1}^{m}\sum\limits_{t\in F^{+}(i)}Sim(i,t)\left\| U_{i} - U_{t}\right\|^{2}_{F} $$
(4)

where β > 0. Yang et al. (2012b) extend the idea of Ma et al. (2011) by arguing that a user trusts different subsets of friends for different categories of items. They use the matrix factorization approach to propose the circle-based recommendation (CircleCon) model for recommendations in online social networks. This model finds a user’s level of expertise in a specific category and gives more weight to such a user when rating items of that category. However, only the category has been considered as contextual information in this approach.
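The difference between the average-based term (Eq. 3) and the individual-based term (Eq. 4) can be made concrete with a small numerical sketch. The function names, the dictionary-based friend lists and the toy data are assumptions made for this illustration. Users without friends are skipped, since Eq. 3 is undefined for an empty friend list.

```python
import numpy as np


def average_based_reg(U, friends, sim, alpha):
    """Eq. 3: distance between each user and the similarity-weighted
    average of his friends' taste vectors."""
    total = 0.0
    for i, fr in friends.items():
        if not fr:
            continue                              # no friends, no penalty
        w = np.array([sim[i, t] for t in fr])
        if w.sum() == 0.0:
            continue                              # avoid division by zero
        avg = (w[:, None] * U[fr]).sum(axis=0) / w.sum()
        total += np.sum((U[i] - avg) ** 2)
    return 0.5 * alpha * total


def individual_based_reg(U, friends, sim, beta):
    """Eq. 4: similarity-weighted distance to every friend separately."""
    total = 0.0
    for i, fr in friends.items():
        for t in fr:
            total += sim[i, t] * np.sum((U[i] - U[t]) ** 2)
    return 0.5 * beta * total


# Toy data: user 0 has two friends whose tastes are exact opposites.
U = np.array([[0.0, 0.0], [2.0, 0.0], [-2.0, 0.0]])
friends = {0: [1, 2]}
sim = np.ones((3, 3))

avg_term = average_based_reg(U, friends, sim, alpha=1.0)
ind_term = individual_based_reg(U, friends, sim, beta=1.0)
```

In this toy case the two opposite friends cancel out in the average, so the average-based term is zero even though neither friend resembles user 0, while the individual-based term is 4.0. This is the loss of information that motivates the individual-based variant.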

A neighborhood-based approach, Trust-CF-ULF, is proposed by Yang et al. (2012a) to improve top-k recommendations using social network information. This approach combines a social network-based approach with the CF approach by adding user latent features obtained from CF to the trust information of social networks. It first finds the k1 nearest neighbors of the source user and then finds the k2 nearest trusted neighbors who do not lie in the k1 set. A voting-based algorithm is applied to retrieve the relevant items of the combined user set. A social temporal collaborative ranking (ST-CoR) model is developed by Liu et al. (2013) to address the challenges of context-aware movie recommendation. One of the challenges in the CAMRa dataset is to combine heterogeneous user feedback, for which Liu et al. (2013) propose a collaborative ranking model that collects diverse user feedback. To handle dynamic changes in the user’s preference for items, they extend this model to a sequential matrix factorization model. Then, a social network regularization function is introduced to enable users with similar preferences to connect and interact with each other.

Shen et al. (2016) combine reputation-based trust, social relations and preference similarity to propose an SRS. They developed the STR model to analyze user purchasing behaviour on e-commerce websites. However, this model is likely to give biased results for individuals. Lastly, Gurini et al. (2018) were the first to combine the matrix factorization approach with sentiment analysis to help people find interesting users. Table 2 shows different SRS using the CF technique.

Table 2 Collaborative filtering-based social recommender systems

4.3 Hybrid social recommender systems

Hybrid techniques combine two or more techniques so that the features of one technique can remedy the pitfalls of another. Typically, a hybrid fuses the features of CBF and CF techniques to produce better results. Current social recommenders typically combine the CF technique with other techniques (e.g., CBF and deep learning) to give improved recommendations (Shokeen and Rana 2019b). Netflix is one example of a site that uses a hybrid RS. It uses the CF technique to compare the searching and watching behaviors of similar users and then uses the CBF technique to present movies that are similar to movies rated highly by the target user. Based on the results of both techniques, it recommends movies to users.

Carrer-Neto et al. (2012) proposed a hybrid tool by adding CF technique in knowledge-based RS for movie recommendation. Knowledge-based RS is used to find instances that match the users’ profiles and CF is used to find similar social relationships from social networks. Capdevila et al. (2016) employ the best features of CF and CBF to recommend locations to users. Since their system is based on geo-location data, they call it GeoSRS. It uses Foursquare, a location-based social network, where a user can write reviews about the places they visit. Both locations and reviews are used as the sources for location recommendation.

To handle the dynamic interests of users, Huang et al. (2014) likewise combine CF and CBF techniques to develop a hybrid model. Tags define a user’s interests, and the frequency of tag usage characterizes the degree of the user’s interest in them. CF determines similar users whereas CBF determines similar resources based on tags. Both techniques are employed to generate personalized recommendations.

On the other hand, Christensen et al. (2016) leveraged demographic filtering in addition to CBF and CF techniques to propose a hybrid RS for the tourism domain. The system uses social relationships to generate recommendations for both groups and individuals. They group the relationships between members of a group into four categories, namely, close relationships, hierarchical relationships, acquaintances and unknown relationships. Weights are assigned to these relationships such that a higher weight corresponds to a lower degree of influence, i.e., an unknown person is given weight 9 and a close person (like a father) is given weight 1. The system analyzes these relationships to derive social influence. Another approach, which uses a graph-based technique along with CF and CBF techniques, is given by Sulieman et al. (2016) for movie recommendation. In this approach, a user-item bipartite graph is built to extract the user collaboration network. A quite different approach is followed by Hussein et al. (2014). They design a software framework for constructing hybrid and context-aware RS and call this framework Hybreed. Hybreed provides an environment for building hybrid RS that takes into account both physical and internal contexts. Table 3 presents SRS proposed by different authors using hybrid techniques.

Table 3 Hybrid social recommender systems

4.4 Fuzzy-based social recommender systems

Fuzzy set theory plays a crucial role in information retrieval and decision making. Fuzzy sets (Zadeh 1965) handle the vagueness of search words entered by users. They are also suited to problems involving dynamic changes and behavior. Fuzzy concepts can deal with the imprecision and subjectivity in information (Shokeen and Rana 2017). The incomplete and imprecise information in fuzzy sets is characterized by membership functions. A fuzzy set F in a universe of discourse U is expressed in terms of its membership function as follows:

$$ F=\{(x,\mu_{F}(x)) \mid x \in U\} $$
(5)

where x is an element of U and \(\mu_{F}(x)\) is the membership function of F, which gives the degree to which x belongs to F. The membership values lie in [0,1]. One of the main issues in SRS is to determine the relationships between users and between users and items. In fuzzy sets, membership functions can compute similarities between users and items by using features of both items and users. For example, trust between users in social networks is not clearly defined. Therefore, it is natural to represent trust as a fuzzy concept in the form of linguistic expressions like low trust, medium trust and high trust. The trust values can be used as the parameters of the membership function to enhance the computation of trust in social networks.
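The linguistic trust terms described above can be illustrated with a minimal sketch that maps a crisp trust score in [0,1] to membership degrees for low, medium and high trust. The piecewise-linear shapes, the breakpoints at 0, 0.5 and 1, and all function names are assumptions chosen for illustration; they are not taken from any of the surveyed systems.

```python
def falling(x, lo, hi):
    """Membership that is 1 up to lo and falls linearly to 0 at hi."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def rising(x, lo, hi):
    """Membership that is 0 up to lo and rises linearly to 1 at hi."""
    return 1.0 - falling(x, lo, hi)


def triangle(x, left, peak, right):
    """Triangular membership with its maximum of 1 at peak."""
    return min(rising(x, left, peak), falling(x, peak, right))


def trust_memberships(trust):
    """Degrees to which a crisp trust score belongs to each linguistic term.

    The breakpoints (0, 0.5, 1) are hypothetical and would normally be tuned.
    """
    return {
        "low": falling(trust, 0.0, 0.5),
        "medium": triangle(trust, 0.0, 0.5, 1.0),
        "high": rising(trust, 0.5, 1.0),
    }


if __name__ == "__main__":
    for score in (0.0, 0.3, 0.75, 1.0):
        print(score, trust_memberships(score))
```

A score of 0.75, for instance, belongs partly to medium trust and partly to high trust, which is the kind of graded statement a crisp threshold cannot express.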

A scalable, distributed fuzzy thesauri-based RS by Ghasemi (2012) divides the job among social users to reduce the calculation cost. The system calculates the fuzzy document-term matrix that specifies the relationship between documents and terms. A fuzzy-based approach extracts the term-term relationship from the document-relation matrix. The system exploits local documents to generate a thesaurus for users. On the other hand, Porcel et al. (2015) present a fuzzy ontology-based RS that uses fuzzy linguistic modeling to decode the trust network between users. The system considers only trustworthy users rather than users with a similar rating history. The MLIOWA (Majority guided Linguistic Induced Ordered Weighted Average) operator is applied to aggregate the trust degrees propagated through different paths.

Most previous work on RS has focused only on content-based systems. However, sequential information also gives essential details about the user’s behavior. Recently, a web-based RS was proposed by Katarya and Verma (2017) that considers both content and sequential information based on the user’s usage patterns of web pages. The RS employs a fuzzy c-means clustering approach to create soft clusters of users. The top-N clusters determine the users most similar to the target user. Lastly, Guan et al. (2018) used the potential of intuitionistic fuzzy sets (Atanassov 1999) to represent uncertain and vague tags. DBSCAN (Density-based Spatial Clustering of Applications with Noise) is used to cluster items whose tags are characterized by intuitionistic fuzzy sets. Table 4 shows SRS based on fuzzy concepts.
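To make the notion of soft clusters concrete, the sketch below computes the standard fuzzy c-means membership degrees of one point with respect to fixed cluster centers. It shows only the membership step, not the full iterative algorithm, and the feature vectors, the fuzzifier m = 2 and the function name are assumptions for illustration rather than the configuration used by Katarya and Verma (2017).

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of `point` in each cluster center.

    Uses u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the
    Euclidean distance from the point to center i and m > 1 is the fuzzifier.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    dists = [math.dist(point, center) for center in centers]
    zero = [d == 0.0 for d in dists]
    if any(zero):
        # The point coincides with one or more centers: share membership there.
        hits = sum(zero)
        return [1.0 / hits if z else 0.0 for z in zero]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


if __name__ == "__main__":
    # Hypothetical 2-D user feature vector and two cluster centers.
    print(fcm_memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)]))
```

Unlike hard clustering, every user receives a degree of membership in every cluster, and the degrees sum to 1.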

Table 4 Fuzzy-based social recommender systems

4.5 Clustering-based social recommender systems

Clustering uses similarity measures to group users or items into clusters. In a clustering-based RS, it is easy to identify the users similar to the target user because the tastes and preferences of users belonging to a cluster are similar, depending upon the context. For example, movies can be clustered by genre, and the movies that match the genre specified by the user can then be recommended. Another influential example of clustering is the content-based recommendation of research papers. Users may also have multiple tastes and preferences and may therefore belong to multiple clusters. When users belong to multiple clusters or communities, this gives rise to overlapping communities, for which multi-label propagation techniques are applied on user-user graphs (Li et al. 2015; Shokeen et al. 2019c).

Social relationships serve as additional information to cluster users (Pham et al. 2011). Pan et al. (2012) used the k-nearest neighbor (k-NN) algorithm to improve the accuracy of tagging-based SRS. This algorithm pre-processes the tagging data to optimize the clustering of tags. A selection approach is used to rank the tag neighbors for generating precise recommendations. Zhang et al. (2014) attempted to resolve the cold-start and data sparsity problems by following a very different approach based on cloud computing. They used a bi-clustering and fusion approach and called it BiFu. It uses trivial ratings to identify the items that users dislike. The trivial ratings are then filtered in the user-item matrix to reduce the dimensionality and improve the accuracy of recommendations. A smoothing parameter λ is introduced to distinguish the original ratings from the smoothed ratings. The fusion parameter γ is then incorporated to fuse the results retrieved from item-based and user-based CF techniques. They implemented this scheme on an SRS and provided it as a cloud service.
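One simple way to read the role of a fusion parameter such as γ is as the mixing weight in a convex combination of the user-based and item-based predictions. The sketch below follows that reading; it is an assumption made for illustration, since the exact fusion rule of BiFu is not given in this survey, and the function name is hypothetical.

```python
def fuse_predictions(user_based, item_based, gamma):
    """Blend two rating predictions with weight gamma in [0, 1].

    gamma = 1 keeps only the user-based prediction and gamma = 0 keeps
    only the item-based one. This convex combination is an illustrative
    assumption, not the documented BiFu formula.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return gamma * user_based + (1.0 - gamma) * item_based


if __name__ == "__main__":
    print(fuse_predictions(4.0, 2.0, 0.5))
```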

Sheugh and Alizadeh (2015) incorporated trust relationships and user similarity methods to propose a clustering-based method for trust-aware RS. Based on Euclidean distance, a multi-view clustering method is proposed that uses similarity distance and trust distance to group similar users. Guo et al. (2015) leverage a multi-view clustering approach similar to that of Sheugh and Alizadeh (2015) to improve the coverage and accuracy of recommendations. Instead of Euclidean distance, they use k-medoids multi-view (MV) clustering that exploits both social trust relationships and rating patterns to iteratively cluster the users. In addition, a support vector regression method is devised to predict correct recommendations when users belonging to more than one cluster receive multiple and varied recommendations. Finally, social trust and rating information are employed to develop a probabilistic method to assign clusters to cold-start users. The ability to handle cold-start users makes this approach more practical. On the other hand, Ahmadian et al. (2018b) follow an adaptive neighbor selection approach to devise a social recommendation method called Social Recommendation based on Adaptive Neighbor Selection (SRANS). This method uses a clustering approach that exploits similarity values and trust information between users to calculate the initial neighboring users, which are then used to estimate ratings for unseen items. A confidence model is also proposed to detect and dismiss useless users from the user set. In their other work, Ahmadian et al. (2018a) focused on the dynamic behavior of users in forming clusters and followed a temporal clustering method to propose a Social Recommender based on Temporal Clustering (SRTC). They classified the user ratings of the user-item rating matrix into two categories, namely liked and disliked, and defined a temporal similarity function based on these two groups. This function gives more weight to recent ratings. Trusted social relations are used to build a trust network, and finally trust and similarity values are combined to calculate the similarity weights of the network. The system works well for cold-start users. Table 5 shows different types of clustering-based SRS.
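The idea of a similarity function over liked and disliked items that favors recent ratings can be sketched as follows. The exponential decay, the decay rate, the data layout and the function names are all assumptions made for illustration; this is not the similarity function defined by Ahmadian et al. (2018a).

```python
import math


def recency_weight(age, decay=0.01):
    """Weight in (0, 1] that shrinks exponentially with the rating's age."""
    return math.exp(-decay * age)


def temporal_agreement(ratings_u, ratings_v, now, decay=0.01):
    """Recency-weighted share of co-rated items on which two users agree.

    Each ratings dict maps item -> (liked, timestamp), where `liked` is a
    bool and `timestamp` uses the same unit as `now` (days, for example).
    The age of a co-rated item is measured from the older of its two ratings.
    """
    agree = total = 0.0
    for item in ratings_u.keys() & ratings_v.keys():
        liked_u, t_u = ratings_u[item]
        liked_v, t_v = ratings_v[item]
        weight = recency_weight(now - min(t_u, t_v), decay)
        total += weight
        if liked_u == liked_v:
            agree += weight
    return agree / total if total else 0.0


if __name__ == "__main__":
    u = {"a": (True, 100), "b": (False, 0)}
    v = {"a": (True, 100), "b": (True, 0)}
    # The recent agreement on "a" outweighs the old disagreement on "b".
    print(temporal_agreement(u, v, now=100))
```

Without the decay the two users in the example would agree on exactly half of their co-rated items; with it, the recent agreement dominates.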

Table 5 Clustering-based social recommender systems

4.6 Semantic social recommender systems

There exist different types of social relationships in social networks. It is important to analyze semantic relationships using the edges in social networks. Extraction of semantic relationships and other information gives more meaningful relations for social network analysis. Experts, who can provide the best results, play a vital role in RS by giving intelligent and meaningful results. However, it is challenging to search for domain-specific experts. To resolve this challenge, Davoodi et al. (2013) design a hybrid framework that combines the CBF technique with a semantic social network-based CF technique. It creates profiles for experts and then builds a network of those experts. It aims to discover a community of domain-specific experts in the constructed semantic-based social network so that their suggestions can be used to give recommendations.

Golbeck (2006a) presented TidalTrust, a trust inference algorithm that exploits the provenance of trust annotations in semantic web-based social networks. Experiments with this algorithm on FilmTrust demonstrated its success in content filtering and in giving personalized recommendations. Sellami et al. (2014) leverage the power of semantic technology to improve the analysis of social networks. An Amazon dataset is used to build a semantic social network in which the semantic profiles of items are compared with the semantic profiles of users to generate semantic social recommendations.

Frikha et al. (2015) selected an ontology-oriented approach to explain semantic relationships. They highlighted the need for a semantic user profile for making personalized recommendations. Basically, an ontology is a type of semantic graph in which nodes represent concepts and links connect these nodes. A user-interest ontology is leveraged to explain the semantic information. They designed a semantic SRS for a Tunisian tourism database to generate useful items for users. On the other hand, Wang et al. (2015b) emphasize the lifestyles of users rather than social graphs for friend recommendation on social networks. A semantic-based RS called Friendbook is presented, which is based on a friend-matching graph and similarity metrics to calculate the parallelism between the lifestyles of users. Smartphones are used as the means to sense and analyze the daily routines of users so that the lifestyles of users can be derived. A similarity metric is proposed to characterize the similarity between the lifestyles of any two users. Sulieman et al. (2016) develop a hybrid graph-based semantic SRS. This system exploits the CBF approach to extract semantic information from social networks and the CF approach to extract social information.

It is believed that users prefer the recommendations of their trusted friends to those of strangers when evaluating their preferences (Shokeen and Rana 2018b). Frikha et al. (2017) worked in this direction and proposed a system to infer trust between any two friends in social networks. In this system, an ontology characterizes the relationship between users and their interests and preferences. In contrast to Frikha et al. (2015), a temporal factor is incorporated to characterize the interaction duration between users. To deal with the lack of semantic data in personalized RS in the medical tourism domain, an ontology is integrated with a social semantic RS. The results demonstrated that recommendations of trusted friends are more reliable than recommendations of close friends.

On the other hand, García-Sánchez et al. (2018) used an ontology to develop an advertisement RS for social networks. In their work, an ontology is used to model the dynamic user profiles and the content of ads. A collection of advertisements and the set of users who are registered on social networks are used as input to the system. As user preferences are dynamic, the system adjusts user profile vectors whenever the user clicks on an ad or posts a comment on the website. Recently, Tang et al. (2019) employed semantics in spatial movements for recommending point-of-interest locations to a community of traveling users. Social topics of interest are used to identify the similarity between such users. The more often a user checks in to a location, the more the user is interested in that location. This travel-community-based recommendation is capable of minimizing the data sparsity problem. Table 6 shows different types of semantic-based SRS.

Table 6 Semantic social recommender systems

4.7 Group-based social recommender systems

Most research papers on RS aim at recommending items to individual users. Sometimes, however, there are situations in which we intend to recommend items to a group of people rather than to a single user. For example, we may recommend a good movie for colleagues to watch, a restaurant for family members to have dinner, or music for a gym. Such recommendations are generated by aggregating the individual interests of the members who belong to the group. Quijano-Sanchez et al. (2013) highlighted the significance of leveraging the personality attributes of group members and the relationships between these members to improve group recommendations. It is easy to infer trust between users in social networks like Twitter and Facebook. Users need not supply explicit information about who trusts whom. The daily interactions between users are implicit information that can be used to infer trust between users. Quijano-Sanchez et al. (2013) also introduced the concept of memory in the recommendation process to improve the user satisfaction level. Traditional group RSs aggregate group preferences, assuming that users are independent individuals. However, they ignore the impact of social relationships and social interactions between users, which are crucial for the group decision-making process. In real-world scenarios, people not only adhere to their own preferences, but also comply with the preferences of their close friends.

A hybrid approach is followed by Christensen et al. (2016) to analyze both group and individual preferences for item recommendation in tourism domain. Nowadays, there is a rapid rise in recommending point-of-interest (POI) locations to users. POIs may range from hotels to parks and restaurants and the increasing usage of location-based social networks like Foursquare and Facebook is playing a major role in recommending POIs. Gottapu and Monangi (2017) experimented on location-based social network to recommend POI locations to groups. They find the groups who visit a particular POI and then generate features for the groups visiting that POI. The features are used to build signature for those groups to explain different attributes like number of users, closeness between them and their relationships, etc.

More recently, a study to combine linked open data with social data is performed by Sansonetti (2019) for POI recommendations. Based on social media activities and ratings of sample images, the target user profile is created and updated. The system extracts user demographic information and tagged pages from his/her Facebook account to approximate user preferences and interests. The retrieved data is processed to build a mapping between Facebook data and DBpedia ontology. The mapping is performed to cope with the problems of noisy and ambiguous data. A set of pictures illustrating different kinds of locations is presented to the user, so that the user can select and click the pictures according to their interest. This way images are used to obtain user feedback to predict interesting POIs (Table 7).

Table 7 Group-based social recommender systems

People behave differently in different groups and tend to be influenced by other users in the group. Therefore, it is crucial to reach a trade-off between the preferences of the different people in the group. Zhao et al. (2018) have worked in this direction to make decisions for the whole group, proposing a model called geo-social group recommendation (GSGR). GSGR uses personal preferences, social relationships and social topics as the attributes influencing a group member to select a POI. Different weights are assigned to the choices of different members to reach the final decision for group recommendation. Qin et al. (2018) have also used social connections to discover similar users and items for the construction of the user-item matrix. They divide the big group into different subgroups and perform Singular Value Decomposition (SVD)-based CF to find the candidate sets in the subgroups. A new aggregation function integrates the results from different subgroups into final recommendations. A parameter called subgroup density ρ is defined as \( \rho =\frac {k}{n}\), where k is the number of subgroups and n denotes the number of users present in the group. They evaluated the efficiency of their model, dynamic connection-based social group recommendation (DCSGR), in experiments on two standard datasets.
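Two quantities from this paragraph lend themselves to a short sketch: the subgroup density ρ = k/n, and a group score obtained by weighting the choices of individual members. The weighted average used below is only one plausible aggregation and is an assumption; neither GSGR nor DCSGR is specified at this level of detail in the survey, and the member names and weights are hypothetical.

```python
def subgroup_density(num_subgroups, num_users):
    """Subgroup density rho = k / n for k subgroups in a group of n users."""
    if num_users <= 0:
        raise ValueError("the group must contain at least one user")
    return num_subgroups / num_users


def weighted_group_score(member_scores, member_weights):
    """Weighted average of the members' scores for one candidate item.

    member_scores maps member -> predicted score for the item and
    member_weights maps member -> non-negative influence weight.
    """
    total_weight = sum(member_weights[m] for m in member_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(member_weights[m] * s for m, s in member_scores.items())
    return weighted / total_weight


if __name__ == "__main__":
    print(subgroup_density(4, 20))
    # Hypothetical members: "ana" is three times as influential as "ben".
    print(weighted_group_score({"ana": 5.0, "ben": 1.0}, {"ana": 3.0, "ben": 1.0}))
```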

Recently, Felfernig et al. (2018) focused on aspects of emotions, personality and group dynamics that play a major part in recommending items to groups. The personality of a user can be measured by the number of likes on his/her Facebook posts, followers and followees on Twitter, pictures on Instagram, etc. Similarly, emotions have varied dimensions like happiness, surprise, sadness, fear and disgust. A study by Zheng et al. (2013) showed that leveraging emotions in RS improves its performance.

5 Classification of papers

In this section, we perform a domain-based classification and metric-based classification of the research papers taken in this survey. We also give the tabular representation of these classifications.

5.1 Domain-based classification

It is clear from the survey that most of the research papers focused on implementing SRS in the entertainment sector, especially in recommending movies. Social reviews, e-commerce and geo-location are the domains that have also been used in many research papers for evaluating the recommendation performance. Other areas where research has been conducted are cross-domain knowledge, tourism, images, expert recommendation, article recommendation, web page recommendation, social bookmarking, friend recommendation, advertisement recommendation, etc. Notably, most of the research papers used MovieLens and Epinion datasets. Some of the papers have used multiple datasets for assessment. Table 8 classifies the SRS-based research papers used in this survey into different domains in which they have experimented for evaluation.

Table 8 Classification of selected papers based on domain used

5.2 Metrics-based classification

In this sub-section, we briefly explain the metrics used in the articles selected in this paper. Each recommendation model tries to fix a problem with a distinct objective using a different dataset. Therefore, it is crucial to understand the context before selecting the right metric. Metrics evaluate the ranked list of recommended items. Tables 9 and 10 show the metrics used by each research paper. Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are metrics for measuring the deviation between actual ratings and predicted ratings. These metrics are used to assess prediction accuracy, that is, to find the distance between the actual preferences and the predicted preferences for items. MAE measures the average of the absolute differences between the predicted rating and the actual rating, without considering their directions. MAE is computed as follows:

$$ MAE=\frac{1}{N}\sum\limits_{i,u}|p_{u,i}-a_{u,i}| $$
(6)

where N is the total number of rated user-item pairs over which the sum is taken, \(p_{u,i}\) is the predicted rating of user u for item i and \(a_{u,i}\) is the actual rating given by user u to item i. RMSE is computed as follows:

$$ RMSE=\sqrt{\frac{1}{N}\sum\limits_{i,u}(p_{u,i}-a_{u,i})^{2}} $$
(7)

The lower the values of MAE and RMSE, the higher the recommendation accuracy (Fan et al. 2019). However, RMSE gives more weight to large errors because the errors are squared before they are averaged. When large errors are particularly undesirable, RMSE is therefore a more suitable metric than MAE.
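Equations (6) and (7) translate directly into a few lines of code. The sketch below assumes that the predicted and actual ratings are given as two aligned lists, one entry per rated user-item pair; the function names and the toy ratings are illustrative.

```python
import math


def mae(predicted, actual):
    """Mean absolute error over aligned lists of predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def rmse(predicted, actual):
    """Root mean square error over aligned lists of predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


if __name__ == "__main__":
    actual = [4, 3, 5, 2]
    spread_out = [5, 4, 6, 3]    # every prediction is off by 1
    one_big_miss = [4, 3, 5, 6]  # a single prediction is off by 4
    print(mae(spread_out, actual), rmse(spread_out, actual))
    print(mae(one_big_miss, actual), rmse(one_big_miss, actual))
```

Both toy predictors have the same MAE of 1, but the one with a single large miss has an RMSE of 2 rather than 1, which illustrates why RMSE is preferred when large errors are especially costly.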

Table 9 Classification of selected papers based on metrics used
Table 10 Classification of selected papers based on metrics used

In binary classification or binary ratings, precision and recall are used to evaluate how well items are classified into selected and not-selected items. In comparison to ratings datasets, binary selection datasets are not sparse, as each item is either selected or not selected by the user. Examples of such datasets include an iris classification dataset in which an image is classified as good or bad, and news click streams in which the value of a visited item is set to 1 and 0 otherwise. We can classify the recommendation results into preferred and non-preferred recommendations. Recommended items that are preferred are true positives (tp), recommended items that are not preferred are false positives (fp), preferred items that are not recommended are false negatives (fn), and items that are neither preferred nor recommended are true negatives (tn). Recall is the fraction of relevant items retrieved out of the total relevant items. Recall is computed as follows:

$$ \text{Recall}=\frac{|\text{recommended} \cap \text{relevant}|}{|\text{relevant}|}=\frac{\#tp}{\#tp+\#fn} $$
(8)

On the other hand, precision is defined as the fraction of correctly recommended items out of the total recommended items. Precision can be calculated as follows:

$$ \text{Precision}=\frac{|\text{recommended} \cap \text{relevant}|}{|\text{recommended}|}=\frac{\#tp}{\#tp+\#fp} $$
(9)

When both precision and recall are useful in some proportion, the F-score or F-measure is calculated. When precision and recall are equally important, the F-score is termed the F1-score, which is the harmonic mean of precision and recall. The F1-score is calculated as follows:

$$ F_{1}\text{-score}=2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} $$
(10)
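Equations (8) to (10) can be computed from the set of recommended items and the set of relevant items. The following sketch assumes both are given as collections of item identifiers; the function name, the item labels and the convention of returning 0 for an empty denominator are assumptions for illustration.

```python
def precision_recall_f1(recommended, relevant):
    """Return (precision, recall, F1-score) for one recommendation list.

    tp is the number of recommended items that are also relevant. A metric
    whose denominator is zero is reported as 0.0 by convention here.
    """
    recommended, relevant = set(recommended), set(relevant)
    tp = len(recommended & relevant)
    precision = tp / len(recommended) if recommended else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Two of the four recommended items are among the three relevant ones.
    print(precision_recall_f1(["a", "b", "c", "d"], ["a", "b", "e"]))
```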

Normalized Discounted Cumulative Gain (NDCG) is used when we want to assign a relevance score to the recommended items (Shani and Gunawardana 2011). It is well suited to evaluations that focus on the quality of the ranking. For top-k recommendations and h hits, \(NDCG_{k}\) can be calculated as follows:

$$ NDCG_{k}=\frac{DCG_{k}}{IDCG_{k}} $$
(11)

where \(DCG_{k}=rel_{1} + \sum_{i=2}^{k}\frac{rel_{i}}{\log_{2} i}\) and \(IDCG_{k}=rel_{1} + \sum_{i=2}^{|h|-1}\frac{rel_{i}}{\log_{2} i}\). Here \(rel_{i}\) denotes the relevance score of the item at rank i.
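A minimal sketch of Eq. (11) is given below, using the DCG form stated above, in which the first item is not discounted and the item at rank i ≥ 2 is discounted by log2(i). It assumes that the ideal DCG is obtained by sorting the same relevance scores in descending order and truncating at k, which is the common convention but is an assumption with respect to the upper summation limit written above; the function names are illustrative.

```python
import math


def dcg_at_k(relevances, k):
    """DCG_k = rel_1 + sum over ranks i = 2..k of rel_i / log2(i)."""
    top = list(relevances)[:k]
    if not top:
        return 0.0
    return top[0] + sum(rel / math.log2(rank) for rank, rel in enumerate(top[1:], start=2))


def ndcg_at_k(relevances, k):
    """NDCG_k = DCG_k / IDCG_k, with IDCG_k taken from the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    print(ndcg_at_k([3, 2, 1], 3))  # already in the ideal order
    print(ndcg_at_k([1, 2, 3], 3))  # most relevant item ranked last
```

A list that is already sorted by relevance scores 1, and any ordering that pushes relevant items down the list scores strictly less.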

The coverage metric represents the percentage of entities (items, ratings and users) in the training data for which the RS is able to generate recommendations. Area under curve (AUC) represents the fraction of correctly ordered items in the ranked list (Liu et al. 2013). AUC is also known as the area under the receiver operating characteristic (ROC) curve. Accuracy represents the percentage of correct recommendations out of the total recommended items. Katarya and Verma (2017) used this metric to evaluate the accuracy of a web-page recommendation system. It is clear from Tables 9 and 10 that most of the papers have used the metrics MAE, RMSE, and Precision for evaluation. RMSE and MAE are error metrics that measure the prediction accuracy of recommendation models. On the other hand, precision and recall are used to evaluate algorithms for top-k recommendations. Precision is the fraction of selected items that are relevant, whereas recall is the fraction of relevant items that are selected. NDCG evaluates the ranked list of users and grants higher importance to the top users in the ranked recommended list.
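The reading of AUC as a fraction of correctly ordered items can be made concrete by counting, over all pairs of one relevant and one non-relevant item, how often the relevant item receives the higher score. The sketch below follows this pairwise definition; counting ties as half a correct pair is a common convention adopted here as an assumption, and the function name and scores are illustrative.

```python
def pairwise_auc(scores, relevant):
    """Fraction of (relevant, non-relevant) item pairs that are ordered correctly.

    scores maps item -> predicted score and relevant is the set of relevant
    items. A tie between the two scores counts as half a correct pair.
    """
    positives = [s for item, s in scores.items() if item in relevant]
    negatives = [s for item, s in scores.items() if item not in relevant]
    if not positives or not negatives:
        raise ValueError("need at least one relevant and one non-relevant item")
    correct = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                correct += 1.0
            elif pos == neg:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


if __name__ == "__main__":
    scores = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1}
    # Pairs (a,b), (a,d) and (c,d) are ordered correctly; (c,b) is not.
    print(pairwise_auc(scores, {"a", "c"}))
```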

6 Implementation

This section provides a comparison of the implementation results of a few systems reviewed in this paper. We use the Epinions dataset for evaluating the results of different SRS. Epinions is a consumer review website where users post reviews about products. This dataset is a social rating dataset and is publicly available on its site. For the evaluation of recommendation accuracy, we use two widely used metrics: MAE and RMSE. The smaller the values of RMSE and MAE, the higher the prediction accuracy. It is to be noted that even a slight improvement in the values of RMSE and MAE has a great impact on the accuracy of predictions.

Table 11 gives the results of implementing different systems on the Epinions dataset. The dataset is split into training and test sets in an 80:20 ratio. We used the experimental results of a few algorithms as reported by Fan et al. (2019). The performance of the dTrust neural network model is best when two hidden layers are used (Dang and Ignat 2017). The experimental results of SRANS are different for cold-start users and all users. We present the values of SRANS when the neighborhood size is set to 20.
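An 80:20 division of a rating dataset can be sketched as a shuffled split of the rating records. The text does not state how the split was drawn, so the random shuffle, the fixed seed, the function name and the record layout below are assumptions made for illustration.

```python
import random


def split_ratings(ratings, test_fraction=0.2, seed=0):
    """Shuffle rating records and split them into (train, test) lists.

    ratings is any sequence of records, for example (user, item, rating)
    tuples. The seed makes the hypothetical split reproducible.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must lie strictly between 0 and 1")
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * (1.0 - test_fraction)))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Ten hypothetical (user, item, rating) records.
    records = [(u, i, 3) for u in range(2) for i in range(5)]
    train, test = split_ratings(records)
    print(len(train), len(test))
```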

Table 11 Comparison of different social recommender systems

7 Standard datasets for social recommender systems

In this section, we present some of the datasets frequently used by various SRS. We divide these datasets into standard and benchmark datasets. Netflix and MovieLens are the benchmark datasets. MovieLens is a website that recommends movies to users and uses their ratings to customize the user’s profile for further recommendations. MovieLens datasets are available in different sizes: 100K, 1M, 10M and 20M. Epinions is a consumer review website where members decide whom to trust. The integration of trusted relationships and review ratings determines which reviews are shown to the user. Flixster is a social networking service that uses friendship relations for movie recommendation. Like MovieLens and Flixster, Douban also aims at recommending movies to users. FilmTrust, however, is a website that integrates semantic SNS with trust values for movie prediction (Golbeck et al. 2006b). Each user gives a rating to their friends, which is taken as the trust value. Amazon is a website that contains reviews about different products and their metadata. Last.fm is a music recommendation site based on the music listened to, tags and social networking resources. The Twitter dataset is collected via Twitter APIs and contains 22 million geo-tagged tweets (Bao et al. 2015). Table 12 shows different datasets used in SRS along with the category to which they belong and their corresponding download links.

Table 12 Some datasets for social recommender systems

8 Future scope and challenges

This section discusses several challenges and directions for future work on SRS. SRS is growing into a case of big data research because of the volume of daily interactions on social networks; the veracity of information provided in forms such as trust, distrust and influence between users; the variety of information expressed in the form of ratings, likes, relationships and written reviews; and the velocity of information flow over social networks. The evolution of networks is also a promising direction for future work.

One of the challenges in proposing an algorithm for social recommendation is to determine the attributes affecting recommendation. As different factors affect RS differently, assigning appropriate weights to the attributes is a major task when designing an algorithm. Similarly, the temporal validity of items and news must be taken into account during recommendation. For instance, a statement that used to be correct at one time may become false after a period of time (e.g. “The prime minister of India is Narendra Modi” and “The prime minister of India is Manmohan Singh”). Although such facts and news contradict each other, each can become authentic when supplemented with the relevant time information. In a social network of billions of users, the features and influence of users keep changing, and applying decay factors makes already sparse data even sparser, because older information becomes irrelevant and must be discarded. It is, therefore, challenging to cope with the problem of changing user requirements and social influence in social networks. Further, the growth of SRS generates transactional, streaming, review and rating data (Aguilar et al. 2017). Applying temporal dynamics to such data poses challenges in modern SRS.

SRS make use of social relationships either implicitly or explicitly. Correlation between items is another major information that should be taken into account during recommendation. Therefore, integrating item relationships with social relationships is a major future area.

We come across a huge amount of data on social media and social networks, which is heterogeneous in nature. This data can be age, gender, wall posts, comments, likes, reviews, interests, status updates, hashtags, number of friends, trust and many more. Some of this data is continuous while some is discrete; some is qualitative while some is quantitative. In addition, some social data is weighted. Processing such heterogeneous and complicated social data remains future work.

Social recommendations are mainly based on the fact that user choices and interests are influenced by their social friends. It is often assumed that users are influenced by all their friends to an equal degree; however, this is a false assumption. A user has different trust values for different friends, and these values depend upon the context. Contextual information plays a vital role in decision making. The context information is based on the physical, mental, emotional and social situation of a user. The incorporation of contextual information is essential for generating effective recommendations.

Multi-agent recommendation approaches (Villavicencio et al. 2019) are trending in the age of smartphones, where there is a need for an agent to assist the user most of the time. The agents perceive information from spatial data, data of groups liked by the user, etc. Such an approach would be advantageous for bringing similar people together and fulfilling their objectives. Further, with the excessive use of the internet for social media, it has become a foremost requirement to understand the privacy issues related to user details. Through social media, being online has become a part of life. Sharing personal details, current location, images, etc. can make us cyber victims. Therefore, protecting privacy while posting or sharing anything on social media is one of the future directions in social recommendation.

Group recommendation is also a challenge due to diversity and dynamism in group members. Social affinity that defines closeness of relationships, dependency and position is an important parameter in assessing group membership.

Preference elicitation is an important area of research in decision making (Felfernig et al. 2013). Choices, memory and ratings, through which user preferences are elicited, are some of the parameters that govern the reliability of RS. Another challenge faced by RS is choice overload. RS reduces information load, but a large set of personalized choices, on the other hand, causes choice overload. According to the study of Jilke et al. (2016), more attractive choices reduce the motivation to choose an item. Therefore, insights from psychology are required to address the choice difficulty problem. Lastly, personal factors like mood, emotions and personality also affect the decision-making process of users. Personality-based RS are more efficient than non-personality-based RS. They also retain the loyalty of users towards the system.

9 Conclusion

Social media has emerged as a fuel for marketing, recommendation systems and the analysis of social relationships. SRS is a key research field that has attracted the attention of practitioners, researchers and academicians. This paper surveys various techniques and metrics used by different researchers in designing SRS. The paper also lists the domains and datasets where the aforementioned techniques have been applied. The future trends in this area are also discussed. It is clear from the survey that most models leverage ratings, social relations, and other social data to improve recommendation performance. We hope that deep learning will be the next technology for SRS. Deep learning techniques have seen significant success in recent years. The strong expressive power of neural networks can be utilized in social recommendations to model social relationships and to learn latent, complex and non-linear social relations.