1 Introduction

Deep learning is a branch of machine learning that employs artificial neural networks. In recent years it has gained significant attention from researchers and practitioners in data science, analytics, and Big Data. It uses hierarchical abstraction layers to learn progressively richer representations from training data: an input layer, hidden layers, and an output layer are combined to reach accuracy that traditional data analytics could not match. Deep learning is popular because of its effectiveness in image processing, natural language processing, and speech recognition [17, 28]. Several research works have applied deep learning to big data analytics, and recommender systems in particular have benefited from it to a large extent.

Recommender systems are of utmost importance for organizations in the era of social media, where large-scale, complex, unstructured Big Data lead to information overload. Recommender systems suggest books, products, audio, video, research topics, and so on to users through intelligent information filtering, which is why they have gained huge popularity in recent years. Collaborative filtering, content-based, and hybrid approaches are the main types of recommender systems [15]. Sparsity and cold start are long-standing issues in traditional recommender systems, and several researchers have tried to overcome them by employing tags [31], social information [4, 11, 16], and content [32, 34]. In this paper, these issues are alleviated by using an autoencoder together with indirect trust.

User and item latent representations, which are essential for better recommendations, can be extracted efficiently with deep learning, as earlier work has shown [21, 26]. Deep learning helps recommender systems learn user and item features through hierarchical abstraction layers: the hidden layers of a neural network capture user behavior and item characteristics. Linear models cannot learn user features from sparse matrices because of their shallow structure, whereas deep learning can exploit the non-trivial structure of user and item attributes [40].

Social information can improve recommendation accuracy significantly [37], and several research works have incorporated social trust into recommender systems. In this paper, the trust values of directly trusted users and of indirectly trusted users (friends-of-friends) are both exploited by the proposed approach. Many deep learning models use only one type of feature, i.e. user trust or item attributes, in their hidden layers. In our approach, trust values among users as well as user-item ratings are fed into the neural network through a shared correlated layer.

An autoencoder is used in the proposed approach to improve recommendation accuracy. A deep autoencoder is composed of an encoder layer, hidden layers, and a decoder layer. User-user trust and user-item rating values are encoded in the encoder layer, the recommendation is computed in the hidden layer by our proposed approach, and the output is reconstructed in the decoder layer. A novel neural network formulation is proposed in this paper in which direct and indirect trust values act as weights in the hidden layers.

The user-item matrix is sparse because users rate only a few items, and the user-user trust matrix is similarly sparse because there are few truster-trustee pairs. Several research works have proposed novel approaches to predict ratings from such incomplete data; matrix factorization, for instance, uses latent factors to predict ratings from sparse values. A deep autoencoder can improve rating prediction further.

The main contributions of this paper are as follows:

  1. Implicit and explicit user and item information is included in our work, which enhances recommendation accuracy.

  2. While several research works have leveraged direct social trust to improve recommender systems, the key contribution of this paper is to include both direct and indirect social trust in a deep architecture based on autoencoders.

  3. A shared layer is proposed that can incorporate trust as well as rating information and learn from their correlated representations.

The rest of the paper is organized as follows. Section 2 covers related work on traditional recommendation, social recommendation, and deep learning techniques and their relevance to recommendation. Autoencoders are described in Section 3. The proposed approach is presented in Section 4, where the recommendation model leverages an autoencoder. The experimental analysis is discussed in Section 5. Finally, Section 6 concludes the paper with some key future directions.

2 Related Work

In this section, various recommendation techniques (content-based, collaborative filtering, hybrid, and social trust based) are analyzed in detail, along with deep learning techniques and their application to big data analytics, in particular recommender systems. Deep learning models such as deep belief networks, CNNs, RBMs, and feedforward networks have been applied to specific applications; however, our focus is on the autoencoder because of its dimensionality reduction capability and its strong accuracy in recommender systems. State-of-the-art research works are reviewed and compared below with respect to these factors.

In [1], a hybrid recommender system is described that combines features of content-based and collaborative filtering approaches. The paper proposes Fab, a distributed recommender system that follows this hybrid approach: it leverages the collaborative filtering notion of rating-based similarity between users' tastes along with the content-based use of user profiles. The distance between a user's actual rankings and the ideal rankings is used as the evaluation measure, and Fab is shown to outperform other sources in terms of personalized recommendation. In [19], social network information about users is combined with rating values. The authors note that traditional recommender systems cannot produce high-quality predictions because of sparsity and that existing approaches do not scale to large datasets. Probabilistic matrix factorization is applied to the user-item matrix and the users' social network information, and the experimental analysis shows better prediction quality than collaborative filtering. [10] proposed TrustSVD, which combines trust values with ratings: matrix factorization is employed to overcome the sparsity and cold start issues, and SVD++ is extended with the explicit and implicit influence of trusters and trusted users. Experiments on the FilmTrust, Epinions, Flixster and Ciao datasets show improved Mean Absolute Error and Root Mean Squared Error compared with other state-of-the-art approaches.

In [30], social recommendation is surveyed. Recommender systems are classified into memory-based and model-based approaches, and traditional recommender systems are compared with social recommender systems. The paper concludes that users are influenced by the users they are socially connected to, lists MAE and RMSE as evaluation metrics for recommendation accuracy, and observes that social recommendation, which relies on socially connected users through trust or other explicit links, is still in an early phase and requires substantial further research. In [42], an autoencoder is applied to learn hidden user-item representations. The authors argue that matrix factorization cannot extract complete information from the user-item matrix and therefore produces less relevant results; gradient descent is used to learn the hidden representation, and the proposed approach is shown to outperform state-of-the-art matrix factorization techniques. [23] proposed TDAE, a trust-based recommendation model built on a stacked denoising autoencoder. A novel neural network is proposed that balances the trust and rating representations; since the two representations can predict each other, a correlative regularization term is employed to model this relationship and avoid overfitting. [39] describes deep learning as a remarkable family of machine learning techniques that has achieved success in text understanding, using both supervised and unsupervised techniques to learn representations. The paper explains that Big Data, because of its high volume, velocity, veracity, and variety, cannot be processed easily, and that deep learning plays a significant role in handling such complex data through hierarchical abstraction learning.
In [13], it is noted that matrix factorization is the standard technique for predicting user-item ratings; instead of matrix factorization, recommender systems can use neural networks to learn non-linear relationships from implicit feedback.

In Table 1, existing research works are compared with the proposed approach, AutoTrustRec, on the basis of features such as the neural network used, the activation function, and the underlying technique. Existing works rely on single-source architectures, i.e. only trust, ratings, collaborative filtering or matrix factorization. In the proposed approach, a shared layer of trust and ratings is fed into the autoencoder to provide better recommendation accuracy.

Table 1 Comparison of existing work and proposed approach

3 Background and Preliminaries

An autoencoder is a deep feedforward model composed of an encoder, hidden layers, and a decoder. It is an unsupervised neural network that compresses the data efficiently and then decodes it to reconstruct the output: noise is removed in the encoder layer and the output is rebuilt in the decoder layer. Big Data is typically very high-dimensional, and it is difficult for current learning models to accommodate such large-scale dimensions [25]. The autoencoder performs dimensionality reduction to learn representations from high-dimensional data. PCA is also a dimensionality reduction technique, but the advantage of the autoencoder is its ability to handle both linear and non-linear structure [29]. A bottleneck layer forces the model to focus only on the essential features. Figure 1 depicts the encoder and decoder with dimensionality reduction: the layer sizes shrink at every step as a result of the reduction.

Fig. 1 Autoencoder architecture

Autoencoders have been applied in image processing, speech recognition, and information retrieval. The deep autoencoder [14], stacked denoising autoencoder, and convolutional autoencoder are the variants most commonly used by researchers; sparse, marginalized, and variational autoencoders are further variants [3, 8]. In an autoencoder, the encoder layer is represented by:

$$ \partial =h\left( WX+b\right) $$
(1)

where h is a non-linear activation function, W is the weight matrix of the layer, and b is the bias vector. The decoder layer is represented by:

$$ \overline{X}=g\ \left(W\partial +\overline{b}\right) $$
(2)

where g is a non-linear activation function and W is the weight matrix of the decoder layer.

The loss function is represented by:

$$ \mathrm{Loss}=\sum \limits_{i=1}^{n}{\left\Vert X-\overline{X}\right\Vert}^2 $$
(3)

The autoencoder is considered one of the most suitable deep learning models for handling missing entries in sparse data. In a recommender system, the user-user and user-item matrices are sparse, and the system is effective only if it can predict user ratings accurately. In the autoencoder, the loss function drives down the prediction error, which is its key contribution to recommender systems. Binary cross-entropy and mean squared error are commonly used as loss functions; mean squared error is used in our work.
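To make Eqs. (1)-(3) concrete, the minimal NumPy sketch below runs a single forward pass of an untrained autoencoder over a toy rating matrix and evaluates the squared reconstruction error. The toy matrix, the layer sizes, the separate decoder weights, the rescaling of the reconstruction to the 1-5 rating range, and the masking of unobserved entries are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy user-item rating matrix X (rows: users, columns: items), 0 = missing entry.
X = np.array([[3., 0., 5., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.],
              [0., 0., 4., 4.]])

n_items, n_hidden = X.shape[1], 2                  # bottleneck narrower than the input
W_enc, b_enc = rng.normal(scale=0.1, size=(n_hidden, n_items)), np.zeros(n_hidden)
W_dec, b_dec = rng.normal(scale=0.1, size=(n_items, n_hidden)), np.zeros(n_items)

H = sigmoid(X @ W_enc.T + b_enc)                   # Eq. (1): hidden code h(WX + b)
X_hat = sigmoid(H @ W_dec.T + b_dec) * 5.0         # Eq. (2): reconstruction, rescaled to 1-5
mask = X > 0                                       # score only the observed ratings
loss = np.sum(((X - X_hat) * mask) ** 2)           # Eq. (3): squared reconstruction error
print(round(loss, 3))
```

In practice the weights would be trained by minimizing this loss with gradient descent rather than left at their random initial values.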

Figure 2 illustrates direct and indirect trust in a social context. If a user A provides a recommendation to user B and user B finds it relevant, the trust between them increases; user A and user B are then directly connected through trust. Several research works leverage only this direct social trust. Our contribution is to exploit the fact that strongly but indirectly trusted users can also provide relevant recommendations. For example, User 3 and User 4 are strongly connected, and User 4 and User 5 are also connected through trust. Using transitive closure, a hyperedge between User 3 and User 5 embeds indirect trust: if user A trusts user B and user B trusts user C, indirect trust lets us conclude that user A trusts user C. The path length is limited to 2 edges so that only strong trust information is retained in the social graph.
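The sketch below, again only illustrative, shows one way such indirect (friend-of-friend) trust could be derived from a direct-trust adjacency matrix: two-hop paths are detected by squaring the matrix, and new edges are kept only for pairs that are not already directly connected. The 0.5 weight assigned to indirect edges and the example matrix are assumptions; the two-hop limit mirrors the 2-edge threshold described above.

```python
import numpy as np

# direct[u][v] = 1 if user u directly trusts user v (Users 1-5 map to indices 0-4).
direct = np.array([
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],   # User 3 trusts User 4
    [0, 0, 0, 0, 1],   # User 4 trusts User 5
    [0, 0, 0, 0, 0],
], dtype=float)

two_hop = (direct @ direct) > 0                          # reachable via exactly one intermediary
indirect = np.where(two_hop & (direct == 0), 0.5, 0.0)   # hypothetical weight for indirect edges
trust = np.clip(direct + indirect, 0.0, 1.0)

print(trust[2, 4])   # User 3 -> User 5 now carries an indirect trust value of 0.5
```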

Fig. 2 Direct and indirect trust in social graph

Trust values are sparse, and several research works have used them only in shallow models [23]; they cannot be fed directly into a deep model. With our proposed approach, the trust matrix becomes less sparse and can easily be used in a deep learning model.

In a recommender system, the user-user trust values form an n × n matrix U and the user-item ratings form an m × n matrix I. The predicted rating matrix R (also m × n) contains the user-to-item rating estimates produced by a given approach. The goal of better recommendation is to reduce the gap between the predicted and the original rating values; MAE and RMSE are the evaluation metrics that measure the difference between the predicted and original ratings.

Users rate products on a 1-5 scale according to relevance, as illustrated in Fig. 3; for example, User 1 has given product 1 a rating of 3. For some user-item pairs no entry is available, either because there has been no interaction between the user and the item or because the user is new to the system; these situations cause sparsity and cold start, respectively. A goal of the recommender system is to mitigate these issues, and in the proposed work an autoencoder is used to do so.
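For illustration only, the SciPy sketch below builds such a sparse user-item matrix from hypothetical (user, item, rating) triples and measures how few entries are actually filled; the triples and the matrix shape are made up.

```python
from scipy.sparse import csr_matrix

# Hypothetical (user, item, rating) triples on a 1-5 scale; most pairs are unrated.
users  = [0, 0, 1, 2, 3]
items  = [0, 2, 3, 1, 3]
scores = [3, 5, 1, 4, 2]

R = csr_matrix((scores, (users, items)), shape=(5, 4))   # user 4 has no ratings: cold start
density = R.nnz / (R.shape[0] * R.shape[1])
print(f"{density:.0%} of the user-item matrix is filled; the rest must be predicted")
```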

Fig. 3 User-item interactions

4 Proposed Approach

Several research works have tried to improve recommendation by integrating new techniques for modeling user and item attributes, ranging from computationally intensive matrix factorization to deep learning [2, 5, 9, 22, 33]. A neural network architecture can replace the inner product of matrix factorization with an arbitrary learned function [13]. Hybrid recommendation has also been explored to improve social recommendation, and social information containing users' implicit and explicit feedback has been exploited in recommender systems [12, 35]. These works have contributed to improving recommendation accuracy; deep learning can improve it further because it can learn representations of sparse and highly complex features.

Previous recommender systems based on deep learning use only one type of input, drawn from either user behavior or item attributes; for instance, only ratings are used in [27]. Social recommendation approaches employ trust values among users to produce recommendations, but ratings alone or trust alone is not enough to learn good representations because of shallow learning and sparse values. Deploying multiple types of input is possible, as has already been demonstrated by researchers [7, 20, 36, 41]. In our work, a shared layer is applied on top of the user and item layers to synchronize the user-item rating matrix and the user-user trust matrix. In addition, whereas previous works incorporate only direct trust, our trust layer uses both direct and indirect trust values.

Trust and rating values are used as input to the encoder and correlated through the shared layer:

$$ \mathrm{Trust\ Encoder}:{Y}_u=\sigma \left(WT+{b}_u\right) $$
(4)

$$ \mathrm{Item\ Encoder}:{Y}_i=\sigma \left(WR+{b}_i\right) $$
(5)

where Yu is the encoding of the trust values T with weight matrix W and bias vector bu, Yi is the encoding of the rating values R with weight matrix W and bias vector bi, and σ is the activation function used for element-wise mapping. The sigmoid function is used in this work:

$$ \sigma (x)=\frac{1}{1+{e}^{-x}} $$
(6)

These encoded values are synchronized to form the shared representation Ys, which retains only the entries for which a user-user trust pair matches a user-item rating entry.

$$ {Y}_s={Y}_u\leftrightarrow {Y}_i $$
(7)

A neural network with a non-linear activation function can approximate any continuous function, and deep networks can even approximate discontinuous functions [6]. The function formulated in Eq. (7) is therefore realizable by the neural network. The decoder layer reconstructs the values and provides the predicted ratings as output:

$$ \mathrm{Trust\ Decoder}:{O}_u=\sigma \left(W{Y}_u+{b}_u\right) $$
(8)

$$ \mathrm{Item\ Decoder}:{O}_i=\sigma \left(W{Y}_i+{b}_i\right) $$
(9)
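A minimal NumPy sketch of Eqs. (4)-(9) on toy data is given below: both views are encoded with the sigmoid of Eq. (6), a shared code Ys is formed, and the two decoders reconstruct the trust and rating matrices. Using separate weight matrices for each view, averaging the two codes to obtain Ys, and rescaling the rating reconstruction to the 1-5 range are assumptions made only for this sketch; the paper writes a single W and describes the synchronization of Eq. (7) only informally.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):                                           # Eq. (6)
    return 1.0 / (1.0 + np.exp(-z))

n_users, n_items, k = 6, 4, 3                             # k: width of the hidden code

T = (rng.random((n_users, n_users)) < 0.2).astype(float)        # toy user-user trust (0/1)
R = rng.integers(0, 6, size=(n_users, n_items)).astype(float)   # toy ratings, 0 = missing

# One weight matrix per view; the paper writes a single W in Eqs. (4)-(9),
# so splitting it into per-view encoder/decoder matrices is an assumption.
Wt_enc, bt = rng.normal(scale=0.1, size=(k, n_users)), np.zeros(k)
Wr_enc, br = rng.normal(scale=0.1, size=(k, n_items)), np.zeros(k)
Wt_dec, bu = rng.normal(scale=0.1, size=(n_users, k)), np.zeros(n_users)
Wr_dec, bi = rng.normal(scale=0.1, size=(n_items, k)), np.zeros(n_items)

Yu = sigmoid(T @ Wt_enc.T + bt)        # Eq. (4): trust encoder
Yi = sigmoid(R @ Wr_enc.T + br)        # Eq. (5): rating encoder
Ys = 0.5 * (Yu + Yi)                   # Eq. (7): shared code; averaging the two views
                                       # is one simple way to correlate them (assumption)

Ou = sigmoid(Yu @ Wt_dec.T + bu)       # Eq. (8): reconstructed trust
Oi = sigmoid(Yi @ Wr_dec.T + bi) * 5   # Eq. (9): reconstructed ratings, rescaled to 1-5

print(Ys.shape, Ou.shape, Oi.shape)    # (6, 3) (6, 6) (6, 4)
```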

The main issue with shallow neural networks is that they cannot handle high-dimensional unstructured data: when data of large dimensionality are fed to a network for learning, the error achievable from a fixed number of data points decays very slowly, as Eq. (10) shows.

$$ e={n}^{\frac{-p}{2p+d}} $$
(10)

where n is the number of data points, d is the dimensionality, and p is a parameter of the model. The autoencoder compresses the data and reduces its dimensionality, which is a crucial and valuable step of the autoencoder model.
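As a rough illustration of Eq. (10), the short sketch below evaluates the bound for a fixed number of samples at several dimensionalities; the chosen values of n and p are arbitrary, but the trend shows why reducing d before learning is valuable.

```python
# Evaluating Eq. (10) for a fixed sample size n and model parameter p (both arbitrary):
# as the dimensionality d grows, the achievable error e shrinks far more slowly.
n, p = 100_000, 2
for d in (10, 100, 1000):
    e = n ** (-p / (2 * p + d))
    print(f"d={d:4d}  e={e:.3f}")
```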

Several research works have employed user-item ratings in deep learning models [29, 32]. However, to the best of our knowledge, trust and rating values have not previously been embedded in a deep autoencoder through a shared layer. The advantage of using trust values alongside ratings is that users receive recommendations from their trusted users only. Figure 4 shows the shared layer in which the trust and rating representations are correlated simultaneously.

Fig. 4 Proposed shared layer Autoencoder architecture

Figure 4 shows that the trust values Ti and the ratings Ri are correlated through the encoder layer as Yu and Yi respectively, and reconstructed in the decoder layer as Ou and Oi respectively. The loss function measures the difference between the actual and predicted ratings; we optimize it by incorporating trust along with the ratings.

$$ L=\delta {L}_R $$
(11)

where LR is the loss due to ratings and δ is a trust factor that reduces the rating prediction loss. Previous research works add the loss due to social information as a separate term of a combined loss function; our observation is that the improved trust values themselves have a loss-reducing effect, as shown in the Experiment section. When the prediction model is trained on implicit as well as explicit information about users and items, the loss is reduced significantly, because ratings are then provided by trusted users only.

The user-item rating and user-user trust matrices contain many empty entries, which may drive the prediction model into a poor local optimum. We try to overcome this issue by correlating the item ratings with the trust factor:

$$ \delta =\frac{\sum \limits_{i=0}^n{R}_i}{n}+\varphi\ \left(W,T,{b}_u\right) $$
(12)

where Ri is the rating provided by users to item i, the sum runs over the ratings of trusted users only, and φ(W, T, bu) is a regularization term over the weights and bias values. The advantage of our proposed approach is that ratings are predicted only from users connected through direct or indirect trust. In the next section, the approach is validated on the Epinions, FilmTrust and Ciao datasets.
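A small sketch of how Eqs. (11) and (12) might be computed is shown below. Treating φ as an L2 penalty with a fixed weight is an assumption, since the paper only states that φ regularizes the weights and biases; the toy ratings, weights and biases are likewise made up.

```python
import numpy as np

def trust_factor(trusted_ratings, W, T, b_u, reg=0.01):
    """Eq. (12): mean rating of the trusted users plus a regularization term phi.
    Treating phi as an L2 penalty on W, T and b_u with weight `reg` is an assumption."""
    mean_trusted = np.sum(trusted_ratings) / max(len(trusted_ratings), 1)
    phi = reg * (np.sum(W ** 2) + np.sum(T ** 2) + np.sum(b_u ** 2))
    return mean_trusted + phi

def total_loss(rating_loss, delta):
    """Eq. (11): the rating reconstruction loss scaled by the trust factor delta."""
    return delta * rating_loss

# Toy usage: three ratings from trusted users, small random weights, zero bias.
rng = np.random.default_rng(2)
delta = trust_factor(np.array([4.0, 5.0, 3.0]),
                     rng.normal(scale=0.1, size=(3, 4)),
                     np.array([[0.0, 1.0], [0.0, 0.0]]),
                     np.zeros(3))
print(total_loss(rating_loss=1.7, delta=delta))
```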

5 Experiment Analysis

Experiments are conducted on three public datasets that are widely used for social recommendation. Epinions (http://www.trustlet.org/epinions.html) contains trust values among users, i.e. who trusts whom: a user expresses trust in another user with the value 1, and if there is no trust relation no entry is present, since distrust is omitted to protect user privacy. In addition to trust values, ratings are also available in this dataset, as summarized in Table 2.

Table 2 Epinions dataset statistics

The Epinions dataset is the most relevant to our work, since its ratings and trust values are exactly what the proposed approach correlates.

The FilmTrust dataset (https://www.librec.net/datasets.html#filmtrust) is also used in our work; it likewise contains user-item ratings and user-user trust values. Its statistics are given in Table 3.

Table 3 FilmTrust dataset statistics

Table 4 reports the statistics of the Ciao dataset, which also provides user-item ratings and user-user trust values.

Table 4 Ciao dataset statistics

Several evaluation metrics are used by researchers to measure recommendation accuracy; Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) are the most widely used.

MAE is the average absolute difference between the predicted and actual ratings:

$$ \mathrm{MAE}=\frac{1}{n}\sum \limits_{i=1}^{n}\left|\mathrm{pr}\left(u,i\right)-p\left(u,i\right)\right| $$
(13)

where pr(u, i) is the predicted rating of user u for item i, p(u, i) is the actual rating, and n is the number of entries in the test dataset.

RMSE is the square root of the average squared difference between the predicted and actual ratings:

$$ \mathrm{RMSE}=\sqrt{\frac{1}{n}\sum \limits_{i=1}^{n}{\left(\mathrm{pr}\left(u,i\right)-p\left(u,i\right)\right)}^2} $$
(14)
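Both metrics are straightforward to compute; the sketch below implements Eqs. (13) and (14) with NumPy on made-up predicted and actual ratings.

```python
import numpy as np

def mae(predicted, actual):
    """Eq. (13): mean absolute difference between predicted and actual ratings."""
    predicted, actual = np.asarray(predicted, float), np.asarray(actual, float)
    return np.mean(np.abs(predicted - actual))

def rmse(predicted, actual):
    """Eq. (14): square root of the mean squared difference."""
    predicted, actual = np.asarray(predicted, float), np.asarray(actual, float)
    return np.sqrt(np.mean((predicted - actual) ** 2))

# Made-up predicted/actual pairs from a toy test set.
pred, true = [3.2, 4.1, 1.8], [3, 5, 2]
print(f"MAE={mae(pred, true):.3f}  RMSE={rmse(pred, true):.3f}")
```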

Table 5 shows the results of experiments with our approach under different configurations: first only rating values are used, then ratings and direct trust values are combined in the shared layer, and finally ratings, direct trust, and indirect trust are all included. The results indicate that social trust has a major impact on recommendation accuracy: when trust values are incorporated along with the user-item ratings, MAE and RMSE improve compared with using ratings alone.

Table 5 MAE and RMSE of proposed approach

Figures 5, 6 and 7 show the MAE and RMSE values obtained with our approach on the Epinions, FilmTrust and Ciao datasets. AutoTrustRec is evaluated using ratings only, ratings with direct trust, and ratings with direct and indirect trust. With ratings only, MAE and RMSE are higher than when direct trust is added, which confirms that social information enhances recommendation accuracy significantly. When indirect trust values are added as well, MAE and RMSE improve further. Our proposed approach thus strengthens trust information among users and provides better recommendation accuracy.

Fig. 5 MAE and RMSE on Epinions dataset

Fig. 6 MAE and RMSE on FilmTrust dataset

Fig. 7 MAE and RMSE on Ciao dataset

Our proposed approach is also compared with [24, 38] in Table 6. In [38], social information is extracted from the ratings themselves, while [24] proposes a collaborative filtering based recommender system augmented with social information.

Table 6 MAE and RMSE of proposed approach and existing approaches

Figures 8 and 9 show that our proposed approach outperforms these social information based recommender systems. This is because the deep architecture learns the non-linear relationship between trust and rating information.

Fig. 8 MAE and RMSE on Epinions dataset

Fig. 9 MAE and RMSE on FilmTrust dataset

6 Conclusion and Future Directions

Recommender systems are essential information filtering tools that mitigate the information overload caused by complex, large-scale data. The user-item rating matrix is sparse, and traditional matrix factorization techniques cannot predict ratings from it with high quality. In this paper, a deep architecture is leveraged to learn hidden user and item representations, and social information in the form of user-user trust is used to enhance recommendation accuracy. A shared correlated layer incorporates the user-item ratings and user-user trust values, and recommendations are improved by feeding both direct and indirect trust values into the autoencoder's input layer. The experimental analysis shows that MAE and RMSE improve significantly compared with existing approaches. As future work, other deep learning models can be applied to the shared layer, and activation functions such as tanh and ReLU can be compared in terms of recommendation accuracy.