Keywords

1 Introduction

In the last decade, recommendation systems have assumed an important role in online social media, e-commerce, and entertainment platform such as LinkedIn, YouTube, Research Gate, etc. [1]. Earlier, it was very difficult to find a suitable and efficient recommendation for the users. With the development of technology, recommender systems have grown exponentially in various fields of information and web applications. The number of datasets is available pertaining to the recommender systems which facilitates in generating and different preferences.

Recommender system is accumulation of effective tools that can be used for recommending future preferences of a set of products to consumers and appropriately predict the most probable items [2]. The recommender system consists of various approaches such as content-based filtering, collaborate-based filtering, and hybrid filtering. Collaborative Filtering (CF) plays a vital role in recommendation systems which can help to make recommendations based on users’ interests and preferences by using the previous history. Most developers tend to utilize collaborative filtering as this technique provides the best preferences. To get efficient and appropriate preferences, different models have been encompassed in CF. Further, a large number of datasets is available that can potentially create data sparsity and scalability issues [3]. To analyze this problem and improve the quality of the data. Machine Learning techniques can be utilized. Further, recommender system and their techniques are discussed to make appropriate preferences as per users’ needs. With the help of such techniques and models, better performance of the recommender system can be attained.

This paper has been fragmented into multiple segments, each of which potentially focuses on different aspects of Recommender System. The paper initiates with a fundamental and rudimentary description of recommender systems. Different types of systems have been thoroughly discussed in the primary section of the paper. Major classifications discussed in the paper include content-based filtering, collaborative-based filtering, and hybrid filtering. Further, the paper also emphasizes on the discussion of two major types of memory-based filtering which includes item-based filtering and user-based filtering. After a comprehensive description of the basic types of recommender systems, analysis and findings of data sparsity algorithms have been thoroughly discussed in the paper. Overall, the research paper provides a thorough and in-depth description about the concept of recommender system and the algorithms used for the purpose of resolution of data sparsity.

2 Related Work

In this section, we will discuss the recommender system and their types are discussed further in detail.

2.1 Recommender System: Classifications

Recommender system has evolved as a revolutionary concept that provides end users with the suggestions of information that would be highly useful to them [4]. The recommender system offers appropriate ways for providing personalized results. This system was predominantly employed in e-commerce and entertainment but nowadays it has grown in the field of research and academics. Fundamentally, recommender systems are those systems that predict the future of any items based upon the past behavior of end-user’s [5]. In these days’ machine learning introduced so many algorithms to predict recommendation as per previous preferences. All the classifications of recommender system are represented in Fig. 1 given and further explained in details.

Fig. 1
figure 1

Types and classifications of recommender system

2.2 Content-Based Filtering

The content-based recommender system is developed on single-user preferences. For example, on e-commerce websites, every individual tends to search according to their interest, and this user history is recorded as their past behavior. Further, the system examines the user’s search history and then recommends similar choices to users [6].

2.3 Collaborating-Based Filtering

Collaborating is built upon users’ historical behavior that including star ratings and reviews. This system used to build a frequent change in user’s preferences. All your previous information is gathered over the internet then the system will make a recommendation based on its analysis. This proves that collaborating filtering secures an important place in recommender systems [1].

Model-based collaborative filtering

This technique is beneficial for calculating the matrix factorization and this technique will be more efficient than memory-based collaborative filtering. Some Machine Learning approaches are included to make accurate predictions. Approaches include associate rule, decision tree, clustering, matrix techniques, etc. [7].

Memory-Based Collaborative Filtering

It entirely works with users’ previous database to make a single prediction. For every single prediction, it consists of a preference database of user-item filtering and item-item filtering. A memory-based collaborating system is beneficial for making similarities between the two, due to the sparsity and scalability issue that comes under this scenario [7].

2.4 Hybrid Filtering

Different approaches are introduced in the recommender system and each of which has its functions and parameters. Still, there are some lags in the recommender system to improve for which hybrid filtering can be potentially used. Hybrid filtering is the merger of both content-based filtering and collaborating filtering and it works on system performance. The hybrid technique is used for resolving issues and enhancing the performance of the recommender system [8].

Althbiti et al. [7] the author has proposed item-based collaborative filtering and made use of the movies dataset to do a comparison between different clustering approaches. With the help of such approaches, the author wants to reduce the unpleasant data to remove the sparsity and scalability issues. A developer proposed a novel recommender system to improve the performance of reliability measures. This method is beneficial for collecting unreliable ratings and evaluating the results of reliability measures. In Anand and Bharadwaj [9], a simple approach is introduced to calculate the statistical classification of users’ ratings and behavior. It uses both user-based and item-based which forms a hybrid approach and generates more accurate predictions that improve the performance time. In Guo et al. [10], a method is proposed to extract the unpleasant data to reduce the sparsity issues. As users tend to buy items based on personalized but improper and ineffective feedback could impact the process. To overcome this, the author introduced two methods: the first one is linear regression and the other is multidimensional similarities to tackle the sparsity data and make it reliable. An author proposed a technique particularly cross-domain and transfer learning to facilitate the similarity between distinct user profiles. Model-based and Memory-based collaborative filtering comes together to deal with the sparsity issue in the recommender system. To improve the performance of the recommender system by targeting certain user preferences to predict accurate results.

The author Zhang et al. [11], identifies that the collaborating filtering is suffering from sparsity issues as the number of products selling rapidly increases sparsity and rebuilding the bipartite graph to improve the accuracy and density of the network in the graph than the original one. Further, the author proposed clustering algorithms to handle the performance and accuracy. In Men et al. [12], approaches are introduced to check sparsity under different scenarios and compare those approaches at different levels to enhance input. The long short-term memory algorithm is introduced to analyze the performance time under the same circumstances. Those items which are preferable to sparse values are eliminated from the recommender system to get more accurate recommendations. In Sharifi et al. [13], the author proposed a new algorithm to handle the sparsity issue by using non-negative matrix factorization to predict better results as compared to real data. a method is proposed to modify the collaborative based recommender system with the help of matrix factorization. This method proposed the incorporation Based recommender system to improve the issue of sparsity.

As rating data is not appropriate, using such an algorithm improves the prediction and provides accurate values to each. This will improve the best accuracy and recommendations.

3 Data Sparsity Resolution Algorithm in Recommender Systems and Their Analysis and Findings

In this section, we will compare algorithms which are used in the resolution of data sparsity. Discuss the algorithms used by different paper and then compare their results based on Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). This will help to overcome the sparsity issues in future research work and Figure out among all of the algorithms used by different papers which gave results. All algorithms and their results are discussing Table 1.

Table 1 Comparison of algorithms

A sample of research is completed on the sparsity issue, to reduce the issue researchers propose so many learning algorithms such as SVD, k-means, ANN. Still, our research is not completing each and every paper of sparsity in the recommender system. Above table demonstrate the analysis of the algorithms to overcome the sparsity issue. With the help of this techniques 80% of sparsity issues are removed but there is still 20% sparse data which can potentially affect the recommendation process. Based on the evaluation of different studies, appropriate comparisons have been made between various algorithms. Each algorithm has been explained with their results but it has been observed that singular value decomposition plus plus (SVD++) results provide best accuracy among all the techniques. This approach gives less error in terms of Root Mean Square Error (RMSE)—0.92 and Mean Absolute Error (MAE) is 0.72 [13].

4 Conclusion

To summary, this paper has focused on conducting a thorough and rigorous analysis and evaluation of multivariate research papers. The primary objective of this paper has been to understand the potential trends pertaining to sparsity-related investigations and gain a strong understanding of key solutions that have been proposed by different authors. This work is an output of a broad search strategy which has been executed on a complete range of platforms to gather succinct information for performing the necessary analysis and evaluation. The study has heavily focused on understanding different algorithms proposed by a wide range of authors. In future research the algorithms can be deployed to make comparison between algorithm in case of data sparsity issue, So, that data sparsity issue get resolved. For further research, a more practical approach can be employed to gain a strong and in-depth understanding regarding the working of various algorithms to propose effective solutions.