1 Introduction

A recommender system assists users by suggesting items that match their interests. Such systems have been applied in various domains, including music, books, movies, jokes and news articles. A recommender system infers information about users’ interactions with items and about users’ preferences for items, so that recommendations can be personalized according to users’ wants and needs (Feng et al. 2015; Choi and Suh 2013). In consumer recommender systems research, benchmark recommendation datasets are required to evaluate techniques and demonstrate experimental results. To meet this need, a number of publicly released recommendation datasets are used to extract information on users’ preferences for a set of items and to evaluate the performance of the different techniques and algorithms proposed for recommender systems. The datasets record users’ ratings and, implicitly, their activities, such as the music they listened to, the books they read and the Web sites they visited. These data are used to develop new techniques and to evaluate them against existing recommendation techniques (Li and Chen 2013; Langseth and Nielsen 2015).

Collaborative filtering (CF) is one of the most popular recommendation techniques; it derives its effectiveness from measuring the similarity between users and between items. The following four steps outline the CF technique (Bauer and Nanopoulos 2014; Najafabadi and Mahrin 2016):

  1. i.

    Collect users’ ratings of available items (e.g., movies, CDs or books) in a user profile (user rating database) in order to capture the users’ preferences in the corresponding domain.

  2. ii.

    Identify a set of users (known as neighbors) who are similar to the active user. CF evaluates the similarity between users based on their ratings of common items in the user profile. For example, either they have given similar ratings to available items or they have used similar items.

  3. iii.

    Predict the rating that the active user would give to a product by observing the ratings of the active user’s neighbors. Note that, when predicting the rating of a specific item, many ratings of that product will be missing in the active user’s neighborhood. In other words, a significant number of neighbors might not have rated the product; therefore, mechanisms are needed that can predict ratings of products from a minimal number of ratings.

  4. iv.

    Find products that the active user is likely to be interested in, based on the interests of like-minded users.
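
The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method of any cited work: the toy `ratings` dictionary, the choice of Pearson correlation as the similarity measure, the mean-centred weighted average used for prediction, and all function names are assumptions introduced here.

```python
from math import sqrt

# Step i: hypothetical toy rating profiles, user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 4, "m2": 2, "m3": 5, "m4": 4},
    "carol": {"m1": 1, "m2": 5, "m4": 2},
    "dave":  {"m1": 5, "m2": 3, "m4": 5},
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def pearson(u, v):
    """Step ii: Pearson correlation of two users over their co-rated items."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0
    mu = mean(ratings[u][i] for i in common)
    mv = mean(ratings[v][i] for i in common)
    du = [ratings[u][i] - mu for i in common]
    dv = [ratings[v][i] - mv for i in common]
    denom = sqrt(sum(x * x for x in du)) * sqrt(sum(y * y for y in dv))
    if denom == 0:
        return 0.0
    return sum(x * y for x, y in zip(du, dv)) / denom

def neighbors(user, k=2):
    """Step ii: up to k most similar users with positive similarity."""
    sims = [(pearson(user, v), v) for v in ratings if v != user]
    sims = [(s, v) for s, v in sims if s > 0]
    sims.sort(reverse=True)
    return sims[:k]

def predict(user, item, k=2):
    """Step iii: mean-centred weighted average over neighbors who rated the item."""
    base = mean(ratings[user].values())
    num = den = 0.0
    for s, v in neighbors(user, k):
        if item in ratings[v]:  # many neighbors may not have rated the item
            num += s * (ratings[v][item] - mean(ratings[v].values()))
            den += abs(s)
    return base if den == 0 else base + num / den

def recommend(user, k=2, n=3):
    """Step iv: rank items the user has not rated by predicted rating."""
    seen = set(ratings[user])
    candidates = {i for r in ratings.values() for i in r} - seen
    ranked = sorted(candidates, key=lambda i: predict(user, i, k), reverse=True)
    return ranked[:n]
```

When no neighbor has rated the target item, `predict` falls back to the active user’s own mean rating, which is one simple way of coping with the missing ratings mentioned in step iii.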

Unfortunately, CF may produce poor recommendations when users’ ratings of items are very sparse relative to the huge number of users and items in the user–item matrix (the data sparsity problem). When user ratings of items are lacking, implicit feedback based on users’ activities is used to profile a user’s item preferences. Implicit feedback can indicate users’ preferences by providing additional evidence gathered from observations of users’ behavior. This paper therefore gives a thorough review of CF-based recommender systems that are assessed using public databases. It summarizes the techniques used to improve user feedback (both explicit and implicit) as well as the existing recommender techniques. It is thus intended to help researchers who are interested in conducting further research on CF-based recommender systems.

Several literature reviews on recommender systems have been published. However, these articles focus only on a specific domain of recommender system development or on recommendation approaches. None of them concentrates on a comprehensive analysis of recommendation datasets and their specific characteristics as common benchmarks for researchers. For example, Adomavicius and Tuzhilin (2005) provided an overview of recommendation approaches including collaborative filtering, content-based and hybrid approaches. They presented the limitations of these recommendation approaches and described possible extensions that could enhance the performance of recommender systems. Bobadilla et al. (2013a, b) presented an overview of group recommendation techniques, fundamental recommendation, social filtering and recently developed techniques including bio-inspired and location-aware recommendation techniques. Park et al. (2012) clustered 210 articles on recommender systems by application field, year of publication, journal and data mining technique. Burke (2002) evaluated the landscape of hybrid recommendation techniques and also reviewed hybridization methods in recommender systems.

To the best of our knowledge, no published study reviews collaborative recommender systems from the perspective of the requirements of current public databases. This research aims to provide knowledge on recommender system datasets so that researchers can select the most suitable datasets for their research purpose. To this end, the study describes the artificial intelligence and data mining techniques that can improve CF recommendation capabilities, with a focus on improving users’ feedback. It classifies the selected studies by current public database, by year of publication, by recommendation approach/technique across a range of application fields and by the historical records of users’ activities used. The study is intended to provide insights and to act as a guide for industrial practitioners and researchers. The main contributions of this study are as follows:

  1. (1)

    This research analyzes the studies on collaborative recommender systems from the requirements of current public databases, which show the distribution of articles in public application domains including movies, Web pages, books, music and jokes;

  2. (2)

    Most importantly, this research systematically examines and classifies the studies conducted by current public databases, which provide a list of the historical records of users’ activities referenced most often in the literature;

  3. (3)

    This research classifies the articles by their year of publication, by assorted recommendation approaches/techniques in a range of current recommendation datasets and by historical records of users’ activities used;

  4. (4)

    For each application domain, it analyzes the research achievements on CF recommendation systems and effectively classifies the studies conducted by techniques used in the domain. This will directly support and motivate the practitioners and researchers with the application of recommendation techniques in different domains and provide them with a scheme of recommendation techniques.

The remainder of this review is organized as follows: A summary of recommender system datasets is given in Sect. 2. Section 3 presents the research methodology used to conduct a critical review of existing research articles that have employed CF recommender systems. Section 4 provides a method for classifying articles based on the recommendation techniques and public databases used, and discusses the findings of this classification. Section 5 provides significant implications of the results. Finally, the conclusion is provided in Sect. 6.

Table 1 Description of the public recommender system datasets

2 Summary of public recommendation datasets

This paper thoroughly reviews the collaborative filtering-based recommender systems evaluated using public databases. A number of publicly available recommendation datasets have been employed, which help researchers to extract information on the preferences of users for a set of items and to evaluate the performance of their proposed recommendation techniques. Before a newly developed recommendation technique is tested, the following preprocessing tasks are carried out; this procedure is widely used in evaluating recommender systems (Horsburgh et al. 2015; Briguez et al. 2014; Najafabadi et al. 2017).

  1. (1)

    The first task is to select the datasets. The experiment is conducted using a public dataset, that is, a freely accessible database. The dataset is separated into training and test portions. The training portion (80% of the dataset) is used to learn the different experimental parameters. The test portion (20% of the dataset) is used to evaluate the trained technique. Researchers can then replicate the experiments to investigate and improve their techniques.

  2. (2)

    The second task is to identify a suitable evaluation metric. An evaluation metric is important for the experimental verification process. To support a claim about how well a technique works (improved predictive utility), a metric is employed that is well understood, widely used and reproducible. The evaluation metric measures the quality of a recommender system by how closely the system’s predicted ranking of items for a user matches the user’s true preference ranking. It should also measure how well a system can predict the exact rating value for a specific item, or evaluate the accuracy of the proposed technique by measuring the ratio of recommended items that are relevant.
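
The two preprocessing tasks above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the text does not name specific metrics, so mean absolute error (for exact rating prediction) and precision at N (for the ratio of relevant recommended items) are chosen here as common examples, and the function names and the fixed random seed are hypothetical.

```python
import random

def train_test_split(records, test_ratio=0.2, seed=42):
    """Shuffle rating records and split them into training (80%) and test (20%) portions."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split can be replicated
    cut = int(round(len(shuffled) * (1 - test_ratio)))
    return shuffled[:cut], shuffled[cut:]

def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and true rating values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def precision_at_n(recommended, relevant, n):
    """Ratio of the top-n recommended items that are relevant to the user."""
    top = recommended[:n]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)

# Usage with hypothetical (user, item, rating) records.
records = [(u, i, (u + i) % 5 + 1) for u in range(10) for i in range(10)]
train, test = train_test_split(records)
```

Fixing the seed keeps the split reproducible, which matters when experiments are replicated to compare techniques on the same benchmark.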

Table 1, which is based on the work of Najafabadi and Mahrin (2016), lists the characteristics of each public dataset. These datasets have diverse meta-information and various types of users’ preferences across five domains: movies, jokes, books, Web pages and music tracks. Delicious and MSD incorporate implicit feedback and social information, capturing users’ interests from social networking activities such as tagging, making social comments and listening to music in order to boost the recommendation (Huang et al. 2014). Social tags on items are valuable implicit sources of information about the contents associated with the items and represent the users’ interests and preferences. Therefore, tagging information can be used to enrich item profiles and user profiles to improve the recommendations. In MSD, songs can be associated or linked with other data in sibling datasets such as audio features, artist data, song tags, play counts, lyrics, etc. The MSD is a cluster of complementary datasets contributed by the Second Hand Songs dataset, the MusicXmatch dataset, the Last.fm dataset and the Taste profile subset.

3 Research methodology

The purpose of this review is to identify applications of user feedback, artificial intelligence and data mining techniques in CF recommendation systems. This is done by examining the published articles and providing the community of researchers and practitioners with insight and future directions on CF recommender systems. Hence, this study provides an academic database of the literature for the period 2010–2016, covering 36 journals, and proposes a classification scheme based on recommendation databases, user feedback reflecting users’ activities, and intelligence computing and data mining algorithms to classify the published articles.

In this study, ways to improve user feedback and an overview of state-of-the-art techniques in recommender systems are discussed, particularly with regard to the scarcity of historical records of users’ activities in public datasets. Research articles associated with the descriptors “collaborative filtering,” “sparsity problem” and “available recommendation data-sets” were carefully selected as follows:

  1. i.

    Step 1: Identification of academic databases

In order to present a comprehensive bibliography of articles on collaborative recommendation systems, articles were retrieved from academic databases including IEEE Xplore, Sage, ScienceDirect and ACM Library.

  2. ii.

    Step 2: Preliminary screening of research articles

The search was first conducted based on eight descriptors: “collaborative filtering,” “sparsity problem” and the available recommendation datasets “Jester,” “MovieLens,” “Netflix,” “Delicious,” “Book-Crossing” and “Million Song datasets.” These datasets were chosen because they are available on the Web and are the datasets most used by developers and researchers in the CF domain.

  3. iii.

    Step 3: Result filtering for articles

The research papers were selected as references if they satisfied the following criteria: (1) Publication time: Only papers published between 2010 and 2016 were selected, to ensure that the data gathered were still fresh and up to date. (2) Published in a high-quality publication: To obtain the most relevant and reliable articles, only articles published in academic journals were selected, because such articles are peer-reviewed. Thus, conference articles, textbooks, unpublished articles, master’s and doctoral dissertations, non-English papers and notes were excluded from this study. Following this filtering process, the selected research papers were used as the preliminary references for this study.

  4. iv.

    Step 4: Research paper selection

Lastly, the full text of each publication was examined, and articles that were not related to collaborative recommender systems in solving the sparsity problem were omitted. Ultimately, a total of 131 articles from 36 journals were selected as the final reference list for this research. Figure 1 depicts the research methodology of this review and the defined criteria for conducting it.

Fig. 1 Research methodology of research review

Fig. 2 Classification scheme

This study provides a road map for future work direction on recommender systems research by classifying the academic database of literature. It facilitates accumulated and derived knowledge on the application of user feedbacks, artificial intelligence and data mining techniques in CF recommendation systems.

4 Proposing a classification method

The reviewed articles were categorized into six categories of recommendation databases, two users’ feedbacks which reflected users’ activities and three main categories of techniques used in improving CF. The overall classification scheme for collaborative recommendation articles is presented in Fig. 2. To the best of our knowledge, there has been no research work conducted to comprehensively review collaborative recommendation articles by considering employed public datasets, users’ feedbacks and techniques used in recommender systems.

4.1 Analyzed users’ activities from each public database

The selection of input data (such as user and item characteristics, implicit browsing, buying or clicking activities and explicit ratings) for an improvement technique in a recommender system plays an important role in ensuring the quality of recommendations. This section investigates how information on users’ interests can be gained by collecting user activities, including implicit data and explicit ratings. To this end, each selected article is reviewed and classified according to the public datasets it uses, and the user feedback, product attributes and user attributes drawn from those datasets are explored. In other words, recommendation techniques are usually compared on the basis of publicly available datasets of movies, songs, jokes, Web pages and books, which contain descriptions of user interests in a set of items and serve as input data for recommender systems (Kim and El Saddik 2015; Liu et al. 2013; Hsiao et al. 2014).

Table 2 lists the publicly available datasets (first column), user activities (second column) and user feedback applied (third column) to recommender systems. The public databases have found real-world applications in movie recommendation with MovieLens and Netflix (Feng et al. 2015; Pirasteh et al. 2015), book recommendation with BookCrossing (Li and Chen 2013; Langseth and Nielsen 2015), joke recommendation with Jester (Berkovsky et al. 2012; Yan et al. 2013; Casino et al. 2015), Web page recommendation with Delicious (Huang et al. 2014), and music recommendation with Last.fm and the Million Song Dataset (Kim and El Saddik 2015; Liu et al. 2013; Hsiao et al. 2014). In fact, improving users’ feedback (data captured about the interactions of users with an item) in recommender systems can have a profound impact on improving the CF technique, since it helps in understanding and predicting users’ interests (Hsiao et al. 2014). Accordingly, the research papers that examine how explicit and implicit user feedback can be used to improve recommendation technologies are listed in the fourth column of Table 2. By observing users’ behaviors, whether explicit (user ratings) or implicit, recommender systems can infer users’ preferences to indicate which products they will probably like and thus choose to purchase.

It is noteworthy that algorithms are the main components of recommendation technologies; they employ various types of input data, such as demographic information (age, salary, gender, education, etc.), product data (actor, topic, release time, etc.) and user–item interactions (such as explicit ratings and scores, and implicit comments, searches, click times, purchasing data, etc.), to predict user interests. As shown in Table 2, many research papers rely on explicit feedback, which is the most convenient for modeling users’ interest in items, and improve user feedback by incorporating additional information into the rating values. Examples of this additional information include content-based information (Wu et al. 2014; Kaššák et al. 2015), demographic information (Bakshi et al. 2014; Mehta and Banati 2014), explicit trust information (Bellogín et al. 2014), semantic information (Hawalah and Fasli 2014; Moreno et al. 2016) and social information (Mehta and Banati 2014).

The incorporation of such additional information into ratings has proved successful in dealing with the sparsity problem. However, explicit ratings and additional external information are not always available, and they require user effort and additional manual labor. For example, Delicious collects bookmarks for each URL, as there are no explicit ratings. In such a case, the recommender system can derive users’ interests from implicit user feedback, which indirectly indicates the interests of a user through observation of user behavior (Yakut and Polat 2012; Tyagi and Bharadwaj 2013; Peng et al. 2016).

Table 2 Distribution of articles by public databases and extracted users’ activities

In many research papers (Hawalah and Fasli 2014; Geng et al. 2015), the explicit values are simplified to 1 and 0, a typical technique for implicit data, to reflect whether the user likes the item or not. Peng et al. (2016) and Liu et al. (2013) applied a factorization machine technique to incorporate explicit user ratings into implicit data by converting the ratings into auxiliary data, with the rating values removed, to improve the user–item matrix. Other research works (Peng et al. 2016; Huang et al. 2015; Yan et al. 2013; Liu et al. 2013; Yakut and Polat 2012) have considered implicit and explicit feedback jointly to achieve useful recommendations, treating users’ ratings as implicit feedback on user purchases. For example, Yakut and Polat (2012) converted the user ratings to binary values (a rating greater than or equal to 4 is assigned 1, and 0 otherwise) to show whether users have watched a movie or not. Huang et al. (2015) employed both explicit ratings and implicit play counts on artists to cluster items and users into multiple clusters in order to extract user–item groups for making recommendations. As a result, information about user activities and interactions between users and items (see Table 2) is analyzed as input for recommendation algorithms in order to compare the ways of enhancing the extraction and application of user feedback in recommender systems.
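
The binarization described above (a rating of 4 or more becomes 1, anything else becomes 0) is simple enough to show directly. The sketch below is only an illustration; the function name and the toy rating dictionary are hypothetical, and the threshold of 4 follows the description of Yakut and Polat (2012) given in the text.

```python
def binarize(ratings, threshold=4):
    """Map each explicit rating to 1 if it is >= threshold, else 0."""
    return {
        user: {item: 1 if r >= threshold else 0 for item, r in items.items()}
        for user, items in ratings.items()
    }

# Hypothetical explicit ratings on a 1-5 scale.
explicit = {"u1": {"m1": 5, "m2": 3, "m3": 4}, "u2": {"m1": 2}}
implicit = binarize(explicit)
```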

A recommender system is based on the relationship between users and items derived from users’ interaction records on items (such as explicit ratings and scores, and implicit comments, searches, click times, purchasing data, Web links, etc.). To determine the patterns captured in these records and understand the relationship between users and items, many models and approaches have been proposed during the past 5 years (see Table 2). According to our classification method, these models and approaches are classified into explicit and implicit user feedback and indicate the evidence of user preferences (tags, link relations, personal user data, social contacts, user-contributed content and user–item interaction data). Shang et al. (2010) and Liu et al. (2013) have suggested that implicit feedback may be more accurate than explicit feedback in reflecting users’ preferences. Other researchers (Cheng and Wang 2014; Hsiao et al. 2014) have suggested that in certain systems, implicit user feedback can be more reliable than explicit user feedback.

4.2 Overview of algorithms used in improving CF and classification of articles

Several techniques to improve CF have been developed, which are clustering (Bilge and Polat 2013; Moradi et al. 2015), classification (Da Costa and Manzato 2016; Kim and El Saddik 2015), artificial neural network (ANN) (Devi and Venkatesh 2013), particle swarm optimization (PSO) (Bakshi et al. 2014; Tyagi and Bharadwaj 2013), support vector machine (SVM) (Ghazarian and Nematbakhsh 2015), evolutionary computing techniques (Mehta and Banati 2014; Lu et al. 2015), link analysis (Feng et al. 2015; Zeng et al. 2011), regression, matrix factorization (Pirasteh et al. 2015; Pan and Yang 2013), and they are listed in Table 3.

The distribution of the 131 papers, classified by public dataset and by technique used in recommender systems, is presented in Table 3, and the techniques employed in making recommendations are then summarized. Note that the classification of research works by technique under the proposed scheme draws on the reviews of recommender system papers published by Park et al. (2012), Lu et al. (2015) and Bobadilla et al. (2013a, b). It is meaningful to analyze published works that employ public recommendation databases in terms of the techniques used to improve CF. The aim of Table 3 is to support and motivate practitioners and researchers by providing state-of-the-art knowledge on public datasets and guidelines on how to implement and validate recommender systems in different domains to support users in various decision activities. Table 3 thus helps to clarify how CF can be improved with benchmark and standard datasets.

Table 3 Distribution of articles by public databases and techniques used in improving CF

Table 3 classifies 131 published works according to artificial intelligence and data mining techniques including classification, link analysis, association rule mining, evolutionary computing, regression, matrix factorization, context awareness-based and content-based, mathematical technique, clustering and fuzzy set. The descriptions of aforementioned techniques are as follows:

4.2.1 Intelligence computing algorithms in improving CF

Evolutionary computation is a sub-area of artificial intelligence that can be defined by the type of algorithms it deals with. The intelligence computing algorithms most often employed to address the sparsity of rating data are as follows:

i.:

Particle swarm optimization (PSO): PSO is initialized with a group of random particles (candidate solutions) and then searches for an optimal solution by iteratively updating these particles. PSO can find the global optimum even with a rough initial condition setting. Since it requires only primitive mathematical operators, it is computationally inexpensive in terms of both memory and speed (Bakshi et al. 2014; Tyagi and Bharadwaj 2013). In CF, the PSO technique finds local neighbors (users who have co-rated items) and global neighbors (users who are connected via local neighbors and have transitive similarities) for making predictions on unrated items (Bakshi et al. 2014).
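
The particle update at the heart of PSO can be sketched as follows. This is a generic, minimal PSO for minimizing an arbitrary objective, not the neighbor-selection procedure of the cited works; the inertia and acceleration coefficients (`w`, `c1`, `c2`), the swarm size and the toy quadratic objective are assumptions chosen for illustration.

```python
import random

def pso(objective, dim, bounds, n_particles=30, iters=200,
        w=0.729, c1=1.49445, c2=1.49445, seed=0):
    """Minimize `objective` over `dim` dimensions with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial particles (candidate solutions) with zero velocity.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Only additions and multiplications are needed per update.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Usage: a toy objective with its minimum at (1, -2).
best, best_val = pso(lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2, dim=2, bounds=(-5, 5))
```

In a CF setting the objective would typically be a prediction error computed on known ratings, with each particle encoding, for example, a vector of weights; that mapping is left out here because it depends on the specific method.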

ii.:

Ant colony: Ant colony optimization is a probabilistic technique for solving computational problems that can be reduced to finding good paths through graphs. The ant colony metaphor helps to select the optimal path in the user interface graph. In CF-based recommender systems, the best neighborhood is selected based on the biological metaphor of ant colonies to make recommendations for the active user (Bedi and Sharma 2012).

iii.:

Genetic: Genetic algorithms are stochastic search techniques for solving optimization problems with an objective function that is subject to soft and hard constraints (Lu et al. 2015). They have mainly been employed in two aspects of collaborative recommender systems: clustering and hybrid user models (Mehta and Banati 2014). A genetic algorithm represents candidate solutions as chromosomes. Each parameter is encoded as a gene, so a parameter is optimized together with the chromosome that contains it and many other genes. The population is thus a structure made up of a certain number of chromosomes together with the variables associated with their genes. A fitness function is used to assess the goodness of an individual solution, and the next generation is formed by the offspring of the chromosomes generated during the crossover process. The chromosomes with the highest fitness are selected as parents to provide offspring in the next generation. This process is repeated until a certain fitness has been achieved or until a particular number of offspring have been produced (Lu et al. 2015; Lv et al. 2015). Bobadilla et al. (2011) improve the calculation of similarity between users in CF using a genetic algorithm. They present a metric that combines the values computed between users in the similarity stage with weights calculated by the genetic algorithm.
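
The selection, crossover and mutation loop described above can be sketched in Python. This is a generic illustration under stated assumptions, not the algorithm of Bobadilla et al. (2011): chromosomes are encoded as bit strings, parents are chosen by binary tournament, one elite chromosome is copied unchanged into each new generation, and the stopping rule is a fixed number of generations. All names and parameter values are hypothetical.

```python
import random

def genetic_algorithm(fitness, n_genes, pop_size=40, generations=60,
                      mutation_rate=0.05, seed=0):
    """Maximize `fitness` over bit-string chromosomes of length `n_genes`."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]

    def tournament():
        # Pick two random chromosomes and keep the fitter one as a parent.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        elite = max(pop, key=fitness)
        children = [elite[:]]                     # elitism: best chromosome survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n_genes - 1)     # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each gene with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children

    best = max(pop, key=fitness)
    return best, fitness(best)

# Usage: a toy fitness that counts the ones in the chromosome.
best, score = genetic_algorithm(sum, n_genes=20)
```

For a similarity metric, the genes would instead encode real-valued weights and the fitness would be a measure of prediction quality on known ratings; the bit-string encoding is used here only to keep the sketch short.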

iv.:

Immune Network: The immune network technique simulates the mechanism of a biological immune system fighting foreign pathogens. It can be described as a computational system inspired by theoretical immunology and by observed immune functions, models and principles, which are applied to problem solving. Chen et al. (2015) combine CF with an immune network to solve the sparsity problem by treating the rating data as antigens. The antigens are then copied as the antibodies of the immune networks to generate a number of immune networks for finding the nearest neighbors in CF for an active user or item (Chen et al. 2015; Geng et al. 2015).

4.2.2 Machine learning algorithms in improving CF

To overcome the sparsity problem, some recommender system studies adopt data mining techniques to provide possible connections between users and items, based on retrieving user preferences either explicitly or implicitly, and to obtain the most efficient results. This subsection presents a brief review of the literature related to data mining techniques for solving the sparsity problem. Data mining is known as a set of knowledge discovery tools for finding hidden or new knowledge, or unexpected patterns, in databases or Web sites.

i.:

Matrix factorization: Matrix factorization is one of the most successful methods; it is highly scalable and accurate in reducing the sparsity problem in CF. It transforms both items and users into the same latent factor space. Each user and each item is described by a feature vector inferred from the existing ratings. The unknown ratings are then predicted using the inner products of the corresponding vector pairs (Xu and Yin 2015; Pirasteh et al. 2015; Pan and Yang 2013). Xu and Yin (2015) apply matrix factorization to the user similarity matrix, in which the latent features corresponding to each user’s existing ratings of items are defined, to predict unrated items and improve the accuracy of the CF technique.
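
A minimal sketch of this idea is given below: user and item vectors are learned from the known ratings, and an unknown rating is predicted as their inner product. The training procedure (stochastic gradient descent with L2 regularization), the hyperparameters and the toy ratings are assumptions made for illustration and are not taken from the cited works.

```python
import random
from math import sqrt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train_mf(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
             epochs=2000, seed=0):
    """Learn k-dimensional latent vectors P (users) and Q (items) by SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(P[u], Q[i])            # error on one known rating
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating = inner product of the user and item vectors."""
    return dot(P[u], Q[i])

def rmse(ratings, P, Q):
    return sqrt(sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings))

# Hypothetical sparse ratings as (user index, item index, rating) triples.
ratings = [(0, 0, 5), (0, 1, 3), (0, 3, 1), (1, 0, 4), (1, 3, 1),
           (2, 0, 1), (2, 1, 1), (2, 3, 5), (3, 0, 1), (3, 2, 5), (3, 3, 4)]
P, Q = train_mf(ratings, n_users=4, n_items=4)
```

After training, `predict(P, Q, 1, 1)` gives an estimate for a user–item pair that has no rating in the toy data, which is how the sparse matrix is filled in.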

ii.:

Association rule mining: The association rule mining technique has also been applied to represent users’ interests in various fields for providing recommendation models. This is due to its ability to scale to large datasets and achieve highly accurate recommendations (Tyagi and Bharadwaj 2013). Association rule mining discovers interesting association relationships (known as rules) hidden in databases whose confidence and support are above user-specified minimum levels. The specified minimum confidence and support define how strong the association rules are and how likely the rules are to occur again. Selected rules can form a model for predicting the future interests of a user. In other words, the extracted association rules predict the presence of an item from the occurrences of other items in a transaction, where each transaction is a user action that includes a set of items. Association rules can eliminate the dependency of CF on users’ co-rated items by discovering the hidden connections between users and items from users’ past behaviors (Tyagi and Bharadwaj 2013). Support and confidence are the two key measures for evaluating rules in the association rule mining technique. The support and confidence of an association rule \(X\rightarrow Y\) are defined by Eqs. (1) and (2), and only the rules whose support and confidence reach the specified minimum values are selected as useful rules (Tyagi and Bharadwaj 2013):

$$\begin{aligned} \mathrm{Support}\left( X\rightarrow Y \right)&=\frac{\mathrm{number~of~transactions~which~contain~}X\mathrm{~and~}Y}{\mathrm{number~of~all~transactions~in~the~database}}\\ \mathrm{Confidence}\left( X\rightarrow Y \right)&=\frac{\mathrm{number~of~transactions~which~contain~}X\mathrm{~and~}Y}{\mathrm{number~of~transactions~which~contain~}X} \end{aligned}$$
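
The support and confidence definitions above translate directly into code. The following sketch is illustrative only; the function names and the toy transactions are hypothetical, and item sets are represented as Python sets.

```python
def support(transactions, x, y):
    """Fraction of all transactions that contain every item of X and of Y."""
    both = set(x) | set(y)
    return sum(1 for t in transactions if both <= set(t)) / len(transactions)

def confidence(transactions, x, y):
    """Among the transactions that contain X, the fraction that also contain Y."""
    x = set(x)
    both = x | set(y)
    n_x = sum(1 for t in transactions if x <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    return n_both / n_x if n_x else 0.0

# Hypothetical transactions: each is the set of items in one user action.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "d"},
]
s = support(transactions, {"a"}, {"b"})      # 3 of 5 transactions contain a and b
c = confidence(transactions, {"a"}, {"b"})   # 3 of the 4 transactions with a contain b
```

A rule such as a → b would be kept only if both values reach the user-specified minimum support and minimum confidence.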

iii.:

Forecasting: The forecasting technique predicts the future behaviors of users based on the patterns in their past records. It deals with continuously valued outcomes to shape the logical relationships among users in predicting their interests. The artificial neural network (ANN), a parallel distributed information processing system, is a well-known model for forecasting. An ANN consists of a large number of simple processing entities that are interconnected to form a network; the network learns and self-organizes in order to carry out complex computational tasks (Devi and Venkatesh 2013; Xie et al. 2014). This technique addresses the problem of insufficient ratings by producing the rating predictions that CF requires to support users’ decision making (Ramezani et al. 2014).

iv.:

K-nearest neighbor (K-NN): K-NN is a common basic CF technique used in recommender systems to predict the future behavior of the active user based on the interests of users who share similar interests with the active user. This technique identifies similar users who have previously exhibited similar preferences in order to provide recommendations (Zhu et al. 2011; Hostler et al. 2012).

v.:

Support vector machine (SVM): SVM is an intelligent data analysis technique for classification that finds a linear hyperplane (decision boundary) separating the data in such a way that the margin is maximized. For instance, in a two-class separation problem in two dimensions, there are many possible boundary lines that can separate the classes, and each boundary has an associated margin. The rationale behind SVM is that if the boundary that maximizes the margin is chosen, it is less likely that unknown items will be misclassified. SVM can thus be used as a pattern analysis technique to find and analyze general types of relations (for example, rankings, classifications, clusters, correlations, principal components) in datasets and to compute similarities between pairs of users (Ghazarian and Nematbakhsh 2015). SVM is a function learning algorithm that learns a function from the input data: it tries to find a function f(x) that approximates the relations between data points. The pairs of input data are as follows: \(\{(\hbox {x}_{1}, \hbox {y}_{1}), \ldots , (\hbox {x}_{\mathrm{i}}, \hbox {y}_{\mathrm{i}})\}\) (Yu and Kim 2012). For a linear function, the relationship between input and output data is linear, and the function is computed as follows:

$$\begin{aligned} f\left( x \right) =wx+b, \end{aligned}$$

where \(w\in X\), \(X\) denotes the input space and \(b\) is a real value.

vi.:

Regression: Regression analysis systematically relates two or more variables, here through a linear relationship. It is a versatile and powerful process for analyzing associative relationships between a dependent variable and one or more independent variables. Uses of regression include making predictions, curve fitting and testing systematic hypotheses about relationships between variables. The fitted curve can help identify a trend within a dataset, whether it is linear, parabolic or of another form (Adomavicius and Zhang 2012). Adomavicius and Zhang (2012) use linear regression-based models to find the connections among rating data characteristics in three groups (rating space, rating value and rating frequency distribution) in order to improve recommendation quality in CF. The linear equation is shown below (Liu et al. 2016):

$$\begin{aligned} Y=a+bX, \end{aligned}$$

where Y represents the dependent variable (variable that goes on the Y-axis), a represents the y-intercept, X is the independent variable (i.e., it is plotted on the X-axis), and b is the slope of the line.
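To make the roles of a and b concrete, the following sketch estimates both from data by ordinary least squares. The fitting method, the function name `fit_line` and the sample points are assumptions chosen for illustration; the text above gives only the form of the equation.

```python
def fit_line(xs, ys):
    """Return (a, b) of the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of X and Y divided by the variance of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Hypothetical sample that lies exactly on Y = 1 + 2X.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
a, b = fit_line(xs, ys)
```

Because the sample points lie exactly on a line, the fit recovers the intercept a = 1 and the slope b = 2; with noisy rating data the same formulas give the line that minimizes the squared error.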

vii.:

Link analysis: Link analysis can effectively deal with the sparsity problem by exploring trends and patterns in networks of interconnected objects (users or items). It finds the associations between objects in a database and has shown great potential in enhancing the performance of Web search. Social network analysis is one type of link analysis technique; it discovers the underlying social structure by analyzing the patterns of relationships and interactions between social actors, and it provides recommendations by employing users’ social interactions (such as social comments, social tags and online friending). Link analysis can also employ graph-based techniques to obtain information on relations between users and items: the nodes of the graph are users and items, and the links between nodes are user–item interactions that indicate a user’s interest. Most link analysis algorithms for the Web use a single node in the Web graph to represent a Web page (Li and Chen 2013; Park et al. 2012). In a user–item graph, the nodes fall into two sets, and feedback and transactions form the links connecting nodes of one set to nodes of the other. The intuition behind such bipartite graphs is explained by the following example. Suppose the user–item interaction matrix is modeled as the bipartite graph shown in Fig. 3, where the two sets of nodes are items and users, and an edge connects user X to item P4 if X has made a transaction on item P4 (for example, user X has purchased item P4) (Feng et al. 2015).

In addition, trust-based recommender systems (trust networks) apply link analysis to a social network augmented with trust ratings in order to provide recommendations for users based on the people they trust. A trust network is represented as a directed graph in which the nodes are users and the edges are weighted according to the degree of trust assigned by one user to another. Semantic-based recommender systems exploit the underlying semantic properties and attributes associated with users and items to provide recommendations (Lu et al. 2015).
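The bipartite user–item graph described for link analysis can be sketched as two adjacency maps. In the example below only the edge between user X and item P4 comes from the text; all other users, items and transactions are hypothetical, and scoring candidates by counting user–item–user–item paths is one simple assumed heuristic rather than the method of any cited paper.

```python
from collections import defaultdict

# Hypothetical transactions (user, item); only the edge X-P4 is from the text.
TRANSACTIONS = [
    ("X", "P4"), ("X", "P1"),
    ("Y", "P1"), ("Y", "P2"),
    ("Z", "P4"), ("Z", "P2"), ("Z", "P3"),
]


def build_bipartite(transactions):
    """Return adjacency maps for both sides of the user-item graph."""
    user_items = defaultdict(set)
    item_users = defaultdict(set)
    for user, item in transactions:
        user_items[user].add(item)
        item_users[item].add(user)
    return user_items, item_users


def recommend(user, user_items, item_users):
    """Rank unseen items by the number of user-item-user-item paths reaching them."""
    scores = defaultdict(int)
    for item in user_items[user]:
        for other in item_users[item]:
            if other == user:
                continue
            for candidate in user_items[other]:
                if candidate not in user_items[user]:
                    scores[candidate] += 1
    # Highest path count first; ties are broken by item name.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


user_items, item_users = build_bipartite(TRANSACTIONS)
ranking = recommend("X", user_items, item_users)
```

Here item P2 is reached through both Y (via P1) and Z (via P4), so it ranks above P3, which is reached only through Z.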

viii.:

Fuzzy set-based: Fuzzy set theory offers a broad spectrum of techniques for the management of non-stochastic ambiguity. It is efficient in handling imprecise information, the unsharpness of classes of situations or objects, and the steadiness of users’ profiles (Zhang et al. 2013; Lu et al. 2015). In the paper by Zhang et al. (2013), an item in a collaborative recommender system is represented as a fuzzy set over an assertion set. The value of an attribute for an item is a fuzzy set over the subset of the assertions relevant to that attribute. The user’s interests are represented as a basic interest module that can evaluate items. The user’s extensional interests are expressed as a fuzzy set over the items the user has experienced, whose membership degrees are the ratings. Based on this representation, the user’s interest in an item can then be deduced (Lu et al. 2015; Cheng and Wang 2014; Anand and Mampilli 2014). Anand and Mampilli (2014) have proposed a fuzzy set-based approach in which item profiles enriched by mining tags from movie genres are combined with user preferences on movie features such as actors and directors to improve recommendations in CF.

ix.:

Bayesian networks: Bayesian networks are probabilistic graphical models that use probability to represent uncertainty about the relationships learned from the data. These models are based on the definition of conditional probability and Bayes’ theorem. The concept of prior probability used in these networks plays a crucial role in classification, because the prior probability represents our expectations or acquired knowledge about what the true relationship might be. In particular, the probability of a model given the data (the posterior) is proportional to the product of the likelihood and the prior probability (the prior). The likelihood captures the effect of the data, while the prior specifies the belief in the model before the data are observed. When a Bayesian network is implemented in a collaborative recommender system, each node represents an item and its states correspond to the possible vote values. In the network, each item has a set of parent items that are its best predictors (Lu et al. 2015).

Bayesian networks, also known as probabilistic networks or belief networks, are used to explore the relationships among users and items and to predict user preferences from these relationships in order to improve CF (Langseth and Nielsen 2012). Probabilistic inference is the most common task solved using Bayesian networks. Suppose that grass can become wet through two events: rain or a sprinkler being on. A Bayesian network can model this situation and compute the posterior probability of each explanation, where 1 stands for true and 0 for false (De Campos et al. 2010; Liu et al. 2013). In the equations below, W denotes wet grass, S the sprinkler and R rain; C is a further variable of the network (cloudy sky in the standard version of this example) that is summed out.

$$\begin{aligned} P_r \left( S=1|W=1 \right)&= \frac{P_r \left( S=1,W=1 \right) }{P_r \left( W=1 \right) } = \frac{\sum _{c,r} P_r \left( C=c,S=1,R=r,W=1 \right) }{P_r \left( W=1 \right) } = \frac{0.2781}{0.6471} \approx 0.4298,\\ P_r \left( R=1|W=1 \right)&= \frac{P_r \left( R=1,W=1 \right) }{P_r \left( W=1 \right) } = \frac{\sum _{c,s} P_r \left( C=c,S=s,R=1,W=1 \right) }{P_r \left( W=1 \right) } = \frac{0.4581}{0.6471} \approx 0.7079,\\ P_r \left( W=1 \right)&= \sum _{c,s,r} P_r \left( C=c,S=s,R=r,W=1 \right) = 0.6471. \end{aligned}$$

It can therefore be shown that the grass is more likely to be wet because it is raining: the likelihood ratio is 0.7079/0.4298 = 1.647.
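The posteriors in the wet-grass example can be reproduced by enumerating the joint distribution. The text does not list the conditional probability tables, so the values below are assumptions: they are the tables commonly used for this textbook example, and they yield the figures 0.2781, 0.4581 and 0.6471 quoted above. The network structure (C influences S and R, which both influence W) is likewise assumed.

```python
from itertools import product

# Assumed conditional probability tables (not given in the text).
P_C = {1: 0.5, 0: 0.5}                 # P(C=c)
P_S_GIVEN_C = {0: 0.5, 1: 0.1}         # P(S=1 | C=c)
P_R_GIVEN_C = {0: 0.2, 1: 0.8}         # P(R=1 | C=c)
P_W_GIVEN_SR = {(0, 0): 0.0, (1, 0): 0.9, (0, 1): 0.9, (1, 1): 0.99}  # P(W=1 | S=s, R=r)


def bernoulli(p_true, value):
    """Probability of a binary value given the probability that it is 1."""
    return p_true if value == 1 else 1.0 - p_true


def joint(c, s, r, w):
    """Joint probability factorized along the assumed network structure."""
    return (P_C[c]
            * bernoulli(P_S_GIVEN_C[c], s)
            * bernoulli(P_R_GIVEN_C[c], r)
            * bernoulli(P_W_GIVEN_SR[(s, r)], w))


def marginal(**fixed):
    """Sum the joint over every variable that is not fixed in the call."""
    total = 0.0
    for c, s, r, w in product((0, 1), repeat=4):
        assignment = {"c": c, "s": s, "r": r, "w": w}
        if all(assignment[name] == value for name, value in fixed.items()):
            total += joint(c, s, r, w)
    return total


p_w = marginal(w=1)
p_s_given_w = marginal(s=1, w=1) / p_w
p_r_given_w = marginal(r=1, w=1) / p_w
ratio = p_r_given_w / p_s_given_w
```

Brute-force enumeration is practical here because the network has only four binary variables; larger networks require dedicated inference algorithms.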

x.:

Clustering: Clustering techniques partition a set of data into sub-clusters in order to find the meaningful groups that exist within it (Park et al. 2012). Once clusters have been formed, the opinions of other users in a cluster can be averaged and used to generate recommendations for individual users. A good clustering technique provides high-quality clusters in which the intra-cluster similarity is high and the inter-cluster similarity is low. Ghazanfar and Prügel-Bennett (2014) have proposed a clustering technique in which a user participates partially in different clusters, and recommendations are based on the average across these clusters weighted by the degree of participation. K-means and hierarchical clustering are two kinds of clustering techniques used in recommender systems. K-means takes the number of clusters as an input parameter and then partitions a set of items into that many clusters (Ghazanfar and Prügel-Bennett 2014; Shinde and Kulkarni 2012). Hierarchical clustering generates a set of nested clusters organized as a hierarchical tree (Park et al. 2012).
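A minimal K-means sketch is given below, assuming that users are represented by complete rating vectors and that the initial centroids are supplied by the caller so that the result is deterministic. The data and the function names are hypothetical. Each resulting centroid is the average opinion of its cluster, which is the quantity that cluster-based recommendation relies on.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm with caller-supplied initial centroids."""
    centroids = [list(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical rating vectors of four users over three items.
USERS = [[5, 4, 1], [4, 5, 1], [1, 2, 5], [2, 1, 4]]
labels, centers = kmeans(USERS, centroids=[USERS[0], USERS[2]])
```

In this toy case the first two users form one cluster and the last two the other; in practice rating vectors are sparse, so missing values have to be handled before distances can be computed.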

xi.:

Decision tree: This technique classifies entities into a set of known classes by means of a tree structure based on the features of the entities: the root node is the top node, followed by internal nodes and leaf nodes. Each internal node is labeled with a question (a test on a single attribute value) that determines which branch of the sub-tree applies, the arcs leaving a node cover all possible responses, and each leaf node indicates the value of the target attribute (Ramezani et al. 2014; Park et al. 2012). This technique can improve the calculation of similarities between users or items in CF and thereby the accuracy of CF recommender systems. For example, Ramezani et al. (2014) create different subspaces of users’ interests in items in order to remove the redundant item subspaces for each user. Users who share the same interest patterns in each subspace are then defined as neighbor users based on a user’s tree structure.

Fig. 3 Example of a bipartite graph (Feng et al. 2015)

4.2.3 Other solutions in improving CF

i.:

Context awareness-based, content-based: Recently, many CF studies have focused on extracting context information, that is, information that can be used to characterize the situation of an entity (Formoso et al. 2013; Lu et al. 2015). An entity can be a place, a person or an object that is considered relevant to the interaction between a user and an application, including the user and the application themselves. Contextual information captures additional information for making recommendations, especially in applications where it is not sufficient to consider only users and items, such as recommending according to the user’s preferences under different conditions. For example, we like hot drinks in winter, but we prefer cold drinks in summer. This kind of preference relation is called a conditional preference, and it arises in real-life contexts or situations.

Movahedian and Khayyambashi (2014) improve the accuracy of recommendations by employing the subjective assessments that users assign to items, that is, the reasons users give for assigning a low or high rating to an item. Context information such as time, contextual user preferences, geometrical information, rating knowledge obtained from users, the company of other people (for example, friends, family or colleagues) or expert opinions has recently been considered in existing recommender systems (Movahedian and Khayyambashi 2014; Ren et al. 2013). Context awareness-based techniques include semantic models, item and rating classifications that detect noisy preferences (inconsistencies among users when they elicit ratings for items), local and global techniques and popularity-based models (Hawalah and Fasli 2014).

Various techniques that combine the content-based technique with CF have been developed to solve the sparsity problem (Wu et al. 2014; De Campos et al. 2010). The content-based technique matches user profiles with the attributes of items in order to generate predictions of users’ interests and, unlike CF, ignores the contributions of other users (Wu et al. 2014).

ii.:

Mathematical techniques: proposing similarity measure or prediction measure: Neighborhood formation is an important part of CF. It forms the neighborhood of an entity (finding users similar to an active user or items similar to the candidate item) by using a traditional similarity measure, i.e., Pearson’s correlation coefficient or the cosine-based similarity measure (Najafabadi et al. 2017). When user preferences are very sparse, traditional similarity measures that use only the ratings of co-rated items may produce unreliable neighborhoods. To improve basic CF, researchers have therefore proposed new measures either for neighborhood formation or for predicting the preference of the active user. Some research papers (Gan and Jiang 2013; Patra et al. 2015) propose a new similarity function to replace the basic similarity function in CF when calculating the similarity among users, which can improve neighbor formation (specifying the neighbors of an active user). Others enhance the accuracy of predictions by proposing a new prediction measure (Hernando et al. 2013; Kim et al. 2011) that estimates how much the active user will like an item.
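The weakness of traditional measures on sparse data can be illustrated with Pearson’s correlation coefficient computed over co-rated items. The sketch below uses hypothetical users and ratings, and it takes each user’s mean over the co-rated items only, which is one common variant and an assumption of this example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over co-rated items; None when it is undefined."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return None
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return None  # a constant rating vector has no defined correlation
    return cov / sqrt(var_a * var_b)


# Hypothetical users: u and v share four items, u and w share only two.
u = {"i1": 5, "i2": 4, "i3": 1, "i4": 2}
v = {"i1": 4, "i2": 5, "i3": 2, "i4": 1}
w = {"i1": 2, "i2": 1}

dense = pearson(u, v)   # supported by four co-rated items
sparse = pearson(u, w)  # supported by two co-rated items only
```

With only two co-rated items the coefficient is forced to +1 or -1, so w appears perfectly similar to u even though w rates both items low and u rates them high. This is the kind of unreliable neighborhood that motivates the alternative similarity functions proposed in the literature.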

Fig. 4 Distribution of research articles by public datasets

Fig. 5 Distribution of research articles by publication year

Fig. 6 Distribution of articles by techniques used in recommender systems

4.3 Distribution of articles by public databases and publication years

One hundred and thirty-one articles from 36 journals were selected and classified according to the proposed classification scheme (see Fig. 2). The details are shown in Figs. 4 and 5. The distribution of articles by public databases is presented in Fig. 4. The largest share of recommender system studies employed MovieLens (64 out of 131 articles, or 49%) to construct movie recommendations. The MovieLens dataset, provided by GroupLens Research, is the standard benchmark dataset and contains less sparse data. Thus, the data are easier to preprocess, and the subsequent analysis and evaluation of the quality and efficiency of a proposed recommendation technique can be carried out without much difficulty. Another popular dataset in research works is Netflix; it has frequently been used together with the MovieLens dataset as experimental data (29 out of 131 articles, or 22%). Some recommendation techniques have been proposed and evaluated by using both the Jester and MovieLens datasets (12 out of 131 articles, or 9%).

Figure 4 shows that although many articles have been published in the recommendation field, only a few of them used Delicious (4 out of 131 articles, or 3%), BookCrossing (1 out of 131 articles, or 0.76%) and the Million Song Dataset (MSD) and Last.fm (8 out of 131 articles, or 6%) as experimental datasets. Less than 4% of the research works (only 5 out of 131 articles) took experimental data from Last.fm along with the MovieLens dataset. Therefore, it appears necessary to implement and evaluate newly proposed techniques in fields other than movies. It is noted that research works which use more than two benchmark recommender system datasets for developing new techniques and comparing them with other recommendation techniques have been classified under several types of datasets.

The distribution of research articles published between 2010 and the first quarter of 2016 is shown in Fig. 5. Publications related to CF research increased steadily between 2010 and 2013 and continued to increase rapidly between 2013 and 2015. The decrease in articles in 2016 is probably because research and write-ups were still in progress when the articles were collected (only the first quarter of 2016 is covered). Based on previous publication rates, it is expected that interest in CF research will grow significantly in the future.

4.4 Distribution of articles by techniques used in improving CF

The techniques used in improving CF are described in detail in Sect. 4.2. It is noted that hybrid recommendation techniques are developed by combining more than one of the techniques presented in the classification scheme (see Fig. 2). Articles that employ more than one technique have been classified as hybrid techniques. In recent years, it has been shown that a single algorithm is generally unable to minimize the shortcomings of basic CF and optimize recommendation accuracy. This explains why researchers have built a number of successful recommender systems that employ hybrid techniques (20 out of 131 research articles), as shown in Fig. 6.

As shown in Fig. 6, most of the research conducted in CF has focused on context awareness-based techniques for improving the accuracy of CF (29 out of 131 research articles). This is because contextual information and content-based techniques capture additional information sources beyond the user–item matrix (such as time, tags and comments) to enrich user profiles and provide appropriate recommendations. Researchers have also conducted several studies, classified as mathematical techniques, that propose a new similarity function to identify users who are similar to the active user or a new prediction measure to estimate how well the active user will like an item (16 out of 131 research articles). Matrix factorization is another major focus of current CF research (17 out of 131 research articles); it has been shown that matrix factorization can address CF problems with highly accurate and scalable results in most application fields. Recently developed advanced techniques such as link analysis, regression and fuzzy set-based techniques have also been applied successfully and are widely used in recommender systems today.

Association rule mining is one of the most successful techniques that researchers combine with other techniques, including particle swarm optimization (PSO) and link analysis, to alleviate the sparsity problem in CF. However, few studies have considered developing recommender systems with association rule mining; thus, there are still open issues in association rule mining that need to be addressed in the light of emerging recommendation systems. In addition, several efforts have employed clustering to better handle the challenges in CF (12 out of 131 research articles). Heuristic methods have been developed by adding new methods to existing ones. Classification algorithms, including K-NN, SVM, forecasting and decision tree, have been developed in only a few of the reviewed papers (6 out of 131 research articles).

In summary, a recommendation system is based on three basic kinds of entities: items (e.g., music, news, books and movies), users and user–item historical records (e.g., tags, comments, scores). The main task is to determine useful patterns in the user–item historical records that describe the association among users and items. Predictions are then made for possible user–item links based on these patterns. To accomplish this task, many techniques and algorithms have been developed in recent years. In general, the existing research works that effectively deal with the data sparseness problem in CF can be divided into three categories. The first emphasizes the use of item-specific content, such as link relations, comments and tags; the second focuses on exploiting user-specific information, such as the trust relationships between users. The third uses pure rating data to find “neighbors” of users and make predictions. However, these research papers have used either users’ social information or item content, and few of them have considered the two jointly.

5 Research implications

The findings presented in this paper have several significant implications, as follows:

  • The findings have shown that even though CF research has made great progress in different application fields, there are still open issues in music, book, joke and document recommendation systems that require further research, especially with the emergence of new recommender system applications. MovieLens has been the subject of a large share of CF research, since this dataset is the best-known example and is easy to use. Therefore, in order to fill this gap, more researchers need to use datasets from application fields other than movies.

  • Based on the review of academic research papers and the issues gathered on CF research, it is clear that a good mechanism for improving the users’ preference matrix and selecting a set of “neighbors” for each user is very important. Thus, instead of relying only on Pearson and cosine metrics, researchers have employed data mining and artificial intelligence techniques to select “neighbors” of users in order to better handle the challenging data sparseness problem of CF.

  • Two important features clearly distinguish this research from other review articles in the CF area: (1) it targets and focuses on the public application platforms of recommender systems, and (2) it systematically investigates the research articles along three dimensions: (i) techniques used (including classification, association rule, link analysis, evolutionary computing, regression, matrix factorization, context awareness-based, content-based, mathematical technique, clustering, fuzzy set-based), (ii) benchmark recommendation databases and (iii) the two kinds of user feedback that reflect users’ activities (implicit and explicit feedback).

  • Research works that use practical solutions to derive users’ interests from their implicit behavior are growing every year. Unfortunately, few research works have been published that infer user interests from social networking activities, such as users’ tagging and music listening information, to boost recommendation. Hence, researchers are encouraged to develop effective techniques for dealing with such implicit data.

6 Conclusion and future work

This paper aims to describe and compare public recommendation datasets from different domains to help in choosing a suitable dataset for analyzing and investigating the user activities that can influence a recommender system based on the CF technique. Sources of evidence of users’ interests in user–item interaction data, such as tags, implicit comments, users’ clicks, explicit ratings, interaction records and social contacts, are more effective for achieving useful recommendations. A critical analysis was conducted on existing research articles that employ CF. One hundred and thirty-one research articles published between 2010 and 2016 were selected. These articles were used to analyze public recommendation datasets with various types of user preferences about items or resources belonging to domains including Web pages, movies, jokes, books and music tracks. The purpose of this study was to understand the applications of user feedback, artificial intelligence and data mining techniques in CF recommendation systems by examining the published articles, and to provide the community of researchers and practitioners with insight and future directions for CF recommender systems. Hence, this study provides an academic database of the literature for the period 2010–2016 covering 36 journals and proposes a classification scheme, based on recommendation databases, user feedback reflecting users’ activities, and artificial intelligence and data mining techniques, to classify the published articles.