1 Introduction

In the big data era [14], the ability to select relevant information is fundamental. This task is not simple, because the amount of data collected by and available to a system is enormous. In this field, one of the most important tools for supporting users is the recommender system (RS) [7, 11, 13].

Recommender systems (RSs) are information filtering tools that support users in selecting among all available items [25]. They exploit information about user preferences and item features in order to provide an estimation that classifies items as useful or not useful for each user.

The elements of an RS are the users of the system, the items to be suggested and the transactions [25], which represent the interactions between a user and the system. The most common form of transaction is the rating, an implicit or explicit evaluation of the user's preference for an item [25]. Moreover, the rating can be seen as a function whose domain is the Cartesian product of the user set U and the item set I.

$$\begin{aligned} r : \left( u,i \right) \in U \times I \mapsto r\left( u,i \right) \in \mathcal {R} \end{aligned}$$
(1)

The principal problem of RSs is to determine \(r_{ui} = r \left( u,i \right) \) \(\forall \left( u,i \right) \in U\times I\). RSs can be classified according to the strategy they adopt to forecast ratings.

The most common recommendation strategies are content-based, collaborative filtering and hybrid [6].

Content-based RSs [12, 23] are based on the creation of user and item profiles. These profiles are exploited in order to obtain the user-item affinity, which is most commonly calculated through cosine similarity [17].
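As an illustration, the cosine similarity between a user profile and an item profile can be sketched as follows. The profile vectors and the function name `cosine_similarity` are hypothetical and are not taken from the cited works; returning 0 for an empty profile is an assumption of this sketch.

```python
import numpy as np

def cosine_similarity(user_profile, item_profile):
    """Cosine of the angle between two profile vectors (0.0 if either is all zeros)."""
    u = np.asarray(user_profile, dtype=float)
    v = np.asarray(item_profile, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    if denom == 0.0:
        return 0.0  # assumption: an empty profile has no affinity with anything
    return float(u @ v / denom)

# Hypothetical feature-weight profiles over the same three features.
same = cosine_similarity([1, 0, 1], [1, 0, 1])
orthogonal = cosine_similarity([1, 0], [0, 1])
partial = cosine_similarity([1, 1], [1, 0])
```

Profiles that point in the same direction have affinity 1, while profiles with no features in common have affinity 0.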

Collaborative filtering RSs [31] are based on the known ratings provided by users and can be divided into two classes: memory-based and model-based. Memory-based ones aim to divide users (user-based) or items (item-based) into groups [24]. Model-based ones aim to create a numerical model of the problem through the factorization of the ratings matrix [25].

Hybrid RSs exploit the principal features of the previous methods in order to overcome the problems of each single method [10].

The development of RSs led to the introduction of new elements in order to improve the rating forecasts. Context [2, 16] is one of these elements and leads to context-aware recommender systems [9], which exploit the awareness of context in order to obtain more appropriate rating forecasts.

Context can be defined as “any information useful to characterize the situation of an entity that can affect the way users interact with systems” [1, 30]. It makes it possible to define a new rating function whose domain is the Cartesian product of the user set U, the item set I and the contextual sets \(C_1,\dots ,C_n\) that contain the contextual information analyzed by the system.

$$\begin{aligned} r:U\times I \times C_1 \times \dots \times C_n \mapsto \mathcal {R} \end{aligned}$$
(2)

The introduction of context in a recommender system is possible through three different strategies [2, 10]:

  • Contextual Pre-Filtering: the contextual information is analyzed before the recommendation phase in order to select the proper elements to provide to the recommender system;

  • Contextual Post-Filtering: the contextual information is exploited in order to select the proper rating forecasts provided by the recommendation phase;

  • Contextual Modeling: the contextual information is exploited in the recommendation phase in order to generate appropriate rating forecasts.

Fig. 1 Strategies for introducing contextual information into a recommender system [2]

Figure 1 presents a summary of the described strategies.

The aim of this paper is to describe a novel context-aware recommendation approach. The paper is organized as follows: Sect. 2 contains background and related works; Sect. 3 describes the proposed approach; Sect. 4 presents the experimental phase carried out to evaluate the method described in Sect. 3; Sect. 5 contains conclusions and future works.

2 Background and Related Works

In the recommender systems field, there is a great variety of strategies that are exploited in order to obtain rating forecasts.

Content-based methods can exploit term frequency and inverse document frequency in order to create profiles [22, 28]. Another technique exploited in content-based recommendation is Latent Dirichlet Allocation [28].

In memory-based recommendation, the clustering of users or items can be obtained through Pearson correlation [18] or through the K-nearest neighbors algorithm [29]. In model-based recommendation, various factorization methods are exploited, such as probabilistic matrix factorization (PMF), non-negative matrix factorization (NMF) and singular value decomposition (SVD) [8]. In particular, singular value decomposition factorizes the ratings matrix \(R \in \mathbb {R}^{m \times n}\), which refers to a system of m users and n items, into the product of three matrices: the matrices \(U \in \mathbb {R}^{m \times m}\) and \(V \in \mathbb {R}^{n \times n}\), called the matrices of left and right singular vectors, and the matrix \(D = \text{ diag } \left( \sigma _1,\dots ,\sigma _p \right) \in \mathbb {R}^{m \times n}\), called the matrix of the singular values, where \(p=\min \{m,n\}\).

$$\begin{aligned} R = U D V^{\text {T}} \end{aligned}$$
(3)

Given a value \(k \in \mathbb {N} \text{: } k \le p\), the Eckart-Young Theorem [12] makes it possible to approximate the ratings matrix through the matrices \(U_k \in \mathbb {R}^{m \times k}\), formed by the first k columns of U, \(V_k \in \mathbb {R}^{n \times k}\), formed by the first k columns of V, and \(D_k \in \mathbb {R}^{k \times k}\), the diagonal matrix of the k largest singular values of D.

$$\begin{aligned} R \approx U_k D_k V_k^T \end{aligned}$$
(4)

The Eckart-Young Theorem guarantees that the matrix \( R_k = U_k D_k V_k^T\) is the matrix of rank k that best approximates the ratings matrix R [12].
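The truncation in relation (4) can be sketched with NumPy as follows. This is a minimal illustration on a small dense matrix of random ratings with no missing entries, which is an assumption of the example; the helper name `truncated_svd` is hypothetical. The check at the end relies on the standard formulation of the theorem in the Frobenius norm, where the approximation error equals the root of the sum of the squared discarded singular values.

```python
import numpy as np

def truncated_svd(R, k):
    """Return U_k (m x k), the k largest singular values, and V_k (n x k)."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k].T

# Hypothetical dense ratings matrix: 6 users, 5 items, ratings in 1..5.
rng = np.random.default_rng(0)
R = rng.integers(1, 6, size=(6, 5)).astype(float)

k = 2
U_k, sigma_k, V_k = truncated_svd(R, k)
R_k = (U_k * sigma_k) @ V_k.T           # U_k D_k V_k^T, relation (4)

# Frobenius error of the rank-k approximation versus the discarded spectrum.
error = np.linalg.norm(R - R_k)
sigma_all = np.linalg.svd(R, compute_uv=False)
expected_error = np.sqrt(np.sum(sigma_all[k:] ** 2))
```

NumPy returns the singular values in descending order, so slicing the first k columns keeps the dominant components.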

Matrix factorization is also exploited by context-aware RSs. Indeed, Baltrunas et al. [4] developed context-aware matrix factorization (CAMF), a contextual modeling method that supports the matrix factorization with biases related to contextual information. Splitting approaches [10, 32], instead, are pre-filtering methods that select the proper rows or columns of the ratings matrix and generate rating forecasts through matrix factorization. These approaches divide the known ratings according to the specific context. Thus, on the basis of the contextual information, there are several rows of the ratings matrix referred to the user \(u \in U\) (user splitting) or several columns of the ratings matrix referred to the item \(i \in I\) (item splitting).
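The idea behind item splitting can be sketched as follows. This is a deliberately simplified illustration: every (item, context) pair becomes its own virtual column, whereas the cited approaches select which items to split. The function name `split_items` and the sample data are hypothetical.

```python
def split_items(ratings):
    """Map (user, item, context, rating) tuples to (user, virtual_item, rating).

    Each distinct (item, context) pair gets its own virtual item id, so the
    resulting two-dimensional ratings matrix has one column per pair.
    """
    virtual_ids = {}
    split = []
    for user, item, context, rating in ratings:
        key = (item, context)
        # Assign the next free id the first time a pair is seen.
        vid = virtual_ids.setdefault(key, len(virtual_ids))
        split.append((user, vid, rating))
    return split, virtual_ids

# Hypothetical contextual ratings of a single item in two contexts.
ratings = [
    ("u1", "i1", "home", 4),
    ("u1", "i1", "cinema", 5),
    ("u2", "i1", "home", 3),
]
split, virtual_ids = split_items(ratings)
```

User splitting is symmetric: the key becomes the (user, context) pair and the virtual ids index rows instead of columns.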

Tensor factorization [21] is also exploited by context-aware RSs. Karatzoglou et al. [20] exploit high-order singular value decomposition (HOSVD) [3] in order to generate rating forecasts through a machine learning algorithm. Chen et al. [15], instead, propose a multi-criteria recommender system that exploits a stacked denoising autoencoder and CANDECOMP. In particular, canonical decomposition (CANDECOMP) factorizes a tensor \(\mathcal {R} \in \mathbb {R}^{m \times n \times l}\) as the sum of s rank-1 tensors [21]:

$$\begin{aligned} \mathcal {R}_{ijz} = \sum _{h = 1}^s \lambda _h A_{ih} B_{jh} C_{zh} \end{aligned}$$
(5)

where \(A \in \mathbb {R}^{m \times s}\), \(B \in \mathbb {R}^{n \times s}\) and \(C \in \mathbb {R}^{l \times s}\) are matrices with columns of unit norm and \(\lambda = \left( \lambda _1,\dots ,\lambda _s\right) \in \mathbb {R}^s\) (Fig. 2).

Fig. 2 Graphical representation of CANDECOMP [21]
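Relation (5) can be sketched with a single `numpy.einsum` call. The factor matrices below are random and serve only as an illustration; the helper name `cp_reconstruct` is hypothetical.

```python
import numpy as np

def cp_reconstruct(lam, A, B, C):
    """Rebuild the tensor of relation (5): sum_h lam[h] * A[:, h] o B[:, h] o C[:, h]."""
    return np.einsum("h,ih,jh,zh->ijz", lam, A, B, C)

rng = np.random.default_rng(0)
m, n, l, s = 4, 3, 2, 2

def unit_columns(X):
    """Normalize every column to unit Euclidean norm."""
    return X / np.linalg.norm(X, axis=0)

A = unit_columns(rng.random((m, s)))
B = unit_columns(rng.random((n, s)))
C = unit_columns(rng.random((l, s)))
lam = np.array([3.0, 1.5])

T = cp_reconstruct(lam, A, B, C)

# The same tensor built explicitly as a sum of s rank-1 outer products.
T_explicit = sum(
    lam[h] * np.multiply.outer(np.multiply.outer(A[:, h], B[:, h]), C[:, h])
    for h in range(s)
)
```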

CANDECOMP is exploited in Sect. 3 in order to describe a novel approach for context-aware recommender systems.

3 The Proposed Approach

In this section, the proposed approach is described. It consists of a heuristic method that exploits CANDECOMP and SVD properties in order to generate rating forecasts.

The system is assumed to have m users, n items and one contextual dimension that can assume l values. Thus, the ratings tensor \(\mathcal {R} \in \mathbb {R}^{m \times n \times l}\) has three dimensions. The matrix \(R \in \mathbb {R}^{m \times n}\), instead, contains the ratings without context and can be provided by the dataset. If the dataset does not provide ratings without context, they can be calculated by averaging the known ratings over all l contextual values.
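The averaging step can be sketched as follows, under the assumption that the average for a user-item pair is taken only over the contexts in which that pair has a known rating. The boolean `mask` marking known entries and the function name `context_free_ratings` are hypothetical.

```python
import numpy as np

def context_free_ratings(T, mask):
    """Average the known contextual ratings of each (user, item) pair.

    T    : (m, n, l) ratings tensor; entries where mask is False are ignored.
    mask : (m, n, l) boolean tensor, True where the rating is known.
    Returns the (m, n) matrix R and a boolean (m, n) matrix of known entries.
    """
    counts = mask.sum(axis=2)
    sums = np.where(mask, T, 0.0).sum(axis=2)
    R = np.divide(sums, counts, out=np.zeros_like(sums, dtype=float),
                  where=counts > 0)
    return R, counts > 0

# Hypothetical system: 2 users, 2 items, 3 context values.
T = np.zeros((2, 2, 3))
mask = np.zeros((2, 2, 3), dtype=bool)
T[0, 0, 0], mask[0, 0, 0] = 4.0, True
T[0, 0, 2], mask[0, 0, 2] = 2.0, True
T[1, 1, 1], mask[1, 1, 1] = 5.0, True

R, known = context_free_ratings(T, mask)
```

Pairs that are never rated in any context stay unknown, so they are flagged in `known` rather than treated as zero ratings.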

First, the role of the singular value decomposition is evaluated. Relation (4) approximates the known ratings. Indeed, the matrices \( P = U_k \sqrt{D_k} \in \mathbb {R}^{m \times k}\) and \(Q = V_k \sqrt{D_k} \in \mathbb {R}^{n \times k}\) yield an approximation of the ratings matrix:

$$\begin{aligned} R \approx P Q^T \end{aligned}$$
(6)

Relation (6) is made explicit through relation (7), where \(p_{ih} = {\left( U_k\right) }_{ih} \sqrt{\sigma _h}\) and \(q_{jh} = {\left( V_k\right) }_{jh} \sqrt{\sigma _h}\).

$$\begin{aligned} r_{ij} \approx \sum _{h=1}^k \sigma _h \left( U_k\right) _{ih} \left( V_k\right) _{jh} = \sum _{h=1}^k p_{ih} q_{jh} \end{aligned}$$
(7)
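Relations (6) and (7) can be checked numerically with a short sketch. The ratings matrix below is a hypothetical dense example and the helper name `latent_profiles` is an assumption of this illustration.

```python
import numpy as np

def latent_profiles(R, k):
    """Return P = U_k sqrt(D_k) and Q = V_k sqrt(D_k) as in relation (6)."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    root = np.sqrt(s[:k])
    P = U[:, :k] * root      # p_ih = (U_k)_ih * sqrt(sigma_h)
    Q = Vt[:k].T * root      # q_jh = (V_k)_jh * sqrt(sigma_h)
    return P, Q

# Hypothetical dense ratings matrix: 5 users, 4 items.
R = np.array([
    [5.0, 3.0, 1.0, 1.0],
    [4.0, 3.0, 1.0, 1.0],
    [1.0, 1.0, 4.0, 5.0],
    [1.0, 2.0, 4.0, 4.0],
    [2.0, 1.0, 5.0, 4.0],
])

k = 2
P, Q = latent_profiles(R, k)
R_approx = P @ Q.T           # relation (6)

# Entry-wise form of relation (7) for one user-item pair.
i, j = 0, 2
r_ij = sum(P[i, h] * Q[j, h] for h in range(k))
```

Splitting the singular values evenly between the two factors gives user and item profiles on the same scale, and their inner product returns the rank-k approximation of the rating.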

P and Q are matrices that provide fake numerical profiles of users and items, respectively. The objective of the proposed approach is to construct rating forecasts by calculating the fake numerical profiles of the l contextual values. To this end, the context \(z \in \{1,\dots ,l \}\) is fixed, and the following hypotheses are made:

  • the values k of relation (4) and s of relation (5) coincide;

  • the matrix A of relation (5) is equal to \(U_k\) of relation (4);

  • the matrix B of relation (5) is equal to \(V_k\) of relation (4);

  • the relation \(\lambda _h = \sigma _h \gamma _h\), \(h=1,\ldots ,k\), holds, where \(\gamma _h > 0\), \(h=1,\ldots ,k\).

These hypotheses can be substituted into relation (5).

$$\begin{aligned} \mathcal {R}_{ijz} = \sum _{h = 1}^k \lambda _h A_{ih} B_{jh} C_{zh} = \sum _{h = 1}^k \sigma _h \gamma _h \left( U_k \right) _{ih} \left( V_k \right) _{jh} C_{zh} \end{aligned}$$
(8)

The matrix \(W \in \mathbb {R}^{l \times k}\) is defined by the elements \( W_{zh} = \gamma _h C_{zh}\), \(h=1,\dots ,k\), \(z = 1,\dots ,l\). Since \(P_{ih} Q_{jh} = \sigma _h \left( U_k\right) _{ih} \left( V_k\right) _{jh}\), the previous relation can be reformulated as follows:

$$\begin{aligned} \mathcal {R}_{ijz} = \sum _{h=1}^k P_{ih} Q_{jh} W_{zh} \end{aligned}$$
(9)

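Fixing \(t \in \{1,\dots ,k\}\), both sides of relation (9) are multiplied by \(\left( V_k\right) _{jt}\), summed over the items and divided by \(\sum _{j=1}^n \left| \left( V_k\right) _{jt} \right| \), which gives a weighted average over the items: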

$$\begin{aligned} \frac{\sum _{j=1}^n \left( V_k\right) _{jt} \mathcal {R}_{ijz}}{\sum _{j=1}^n \left| \left( V_k\right) _{jt} \right| } = \sum _{h=1}^k\frac{\sum _{j=1}^n \left( V_k\right) _{jt} P_{ih} Q_{jh}}{\sum _{j=1}^n \left| \left( V_k\right) _{jt} \right| } W_{zh} \end{aligned}$$
(10)

The same weighted average is then applied over the users, with weights \(\left( U_k\right) _{it}\):

$$\begin{aligned} \frac{\sum _{i=1}^m \sum _{j=1}^n \left( U_k\right) _{it} \left( V_k\right) _{jt} \mathcal {R}_{ijz}}{\left( \sum _{i=1}^m \left| \left( U_k\right) _{it} \right| \right) \left( \sum _{j=1}^n \left| \left( V_k\right) _{jt} \right| \right) } = \sum _{h=1}^k\frac{\sum _{i=1}^m \sum _{j=1}^n \left( U_k\right) _{it} \left( V_k\right) _{jt} P_{ih} Q_{jh}}{\left( \sum _{i=1}^m \left| \left( U_k\right) _{it} \right| \right) \left( \sum _{j=1}^n \left| \left( V_k\right) _{jt} \right| \right) } W_{zh} \end{aligned}$$
(11)

In order to simplify the relation (11), the following quantities are defined:

$$\begin{aligned} \bar{\mathcal {R}}_t^{\left( z\right) } = \frac{\sum _{i=1}^m \sum _{j=1}^n \left( U_k\right) _{it} \left( V_k\right) _{jt} \mathcal {R}_{ijz}}{\left( \sum _{i=1}^m \left| \left( U_k\right) _{it} \right| \right) \left( \sum _{j=1}^n \left| \left( V_k\right) _{jt} \right| \right) } \quad t=1,\ldots ,k \end{aligned}$$
(12)
$$\begin{aligned} \alpha _{th}^{\left( z\right) } = \frac{\sum _{i=1}^m \sum _{j=1}^n \left( U_k\right) _{it} \left( V_k\right) _{jt} P_{ih} Q_{jh}}{\left( \sum _{i=1}^m \left| \left( U_k\right) _{it} \right| \right) \left( \sum _{j=1}^n \left| \left( V_k\right) _{jt} \right| \right) } \quad t,h=1,\ldots ,k \end{aligned}$$
(13)

Relations (12) and (13) make it possible to reformulate relation (11) as follows:

$$\begin{aligned} \bar{\mathcal {R}}_t^{\left( z\right) } = \sum _{h=1}^k \alpha _{th}^{\left( z\right) } W_{zh} \quad t=1,\dots ,k \end{aligned}$$
(14)

Relation (14) is a linear system of k equations in k unknowns that determines the fake numerical profile \(\left( W_{z1},\dots ,W_{zk}\right) \) of the context z. Solving all the l linear systems identifies the elements of the matrix \(W \in \mathbb {R}^{l \times k}\).

In the operative phase, the averages are computed over the known ratings in the system.
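The construction and solution of the linear systems of relation (14) can be sketched as follows. Two choices here are assumptions of this illustration rather than details given in the text: the numerators of relations (12) and (13) are restricted to the known ratings of each context, while the denominators are kept as written, and the systems are solved with a least-squares routine. The boolean `mask` and the function name `context_profiles` are hypothetical.

```python
import numpy as np

def context_profiles(T, mask, U_k, V_k, sigma_k):
    """Solve relation (14) for every context z; returns W with shape (l, k)."""
    P = U_k * np.sqrt(sigma_k)
    Q = V_k * np.sqrt(sigma_k)
    l, k = T.shape[2], U_k.shape[1]
    # Denominators of relations (12) and (13), one value per component t.
    denom = np.abs(U_k).sum(axis=0) * np.abs(V_k).sum(axis=0)
    W = np.zeros((l, k))
    for z in range(l):
        known = mask[:, :, z]
        M = known.astype(float)
        Rz = np.where(known, T[:, :, z], 0.0)
        # Relation (12): r_bar[t] = sum_ij U_it V_jt R_ijz / denom[t]
        r_bar = np.einsum("it,jt,ij->t", U_k, V_k, Rz) / denom
        # Relation (13): alpha[t, h] = sum_ij U_it V_jt P_ih Q_jh / denom[t]
        alpha = np.einsum("it,jt,ih,jh,ij->th", U_k, V_k, P, Q, M) / denom[:, None]
        # Relation (14): alpha @ W[z] = r_bar
        W[z] = np.linalg.lstsq(alpha, r_bar, rcond=None)[0]
    return W

# Synthetic check: build a tensor that follows relation (9) exactly.
rng = np.random.default_rng(0)
m, n, l, k = 8, 7, 3, 2
R = rng.random((m, n))
U, s, Vt = np.linalg.svd(R, full_matrices=False)
U_k, V_k, sigma_k = U[:, :k], Vt[:k].T, s[:k]
P, Q = U_k * np.sqrt(sigma_k), V_k * np.sqrt(sigma_k)
W_true = rng.random((l, k)) + 0.5
T = np.einsum("ih,jh,zh->ijz", P, Q, W_true)
mask = rng.random((m, n, l)) < 0.7       # roughly 70% of the ratings are known

W = context_profiles(T, mask, U_k, V_k, sigma_k)
```

On this synthetic tensor the recovered profiles match the ones used to generate it, because both sides of relation (14) are averaged over the same set of known ratings. On real data the tensor only approximately follows relation (9), so the solution is a heuristic estimate.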

Finally, in order to improve the rating forecasts, contextual biases are exploited. Indeed, the rating forecasts are calculated as follows:

$$\begin{aligned} \hat{\mathcal {R}}_{ijz} = \bar{r}_z + b_{iz} + b_{jz} + \sum _{h=1}^k P_{ih} Q_{jh} W_{zh} \end{aligned}$$
(15)

where \(\bar{r}_z\) is the average of all known ratings in the context z, \(b_{iz}\) is the user i bias in the context z, \(b_{jz}\) is the item j bias in the context z and \(\sum _{h=1}^k P_{ih} Q_{jh} W_{zh}\) is seen as the affinity of user i and item j in the context z.
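Relation (15) can be sketched as follows. The text does not state how the biases are computed, so this illustration assumes that the user (item) bias in context z is the mean deviation of that user's (item's) known ratings from the context average. The function names `context_biases` and `predict` and the sample data are hypothetical.

```python
import numpy as np

def context_biases(T, mask, z):
    """Context average and user/item biases for context z (assumed definition)."""
    known = mask[:, :, z]
    r_bar = np.where(known, T[:, :, z], 0.0).sum() / max(known.sum(), 1)
    deviation = np.where(known, T[:, :, z] - r_bar, 0.0)
    # Mean deviation per user (rows) and per item (columns); 0 if nothing is known.
    b_user = deviation.sum(axis=1) / np.maximum(known.sum(axis=1), 1)
    b_item = deviation.sum(axis=0) / np.maximum(known.sum(axis=0), 1)
    return r_bar, b_user, b_item

def predict(i, j, z, r_bar, b_user, b_item, P, Q, W):
    """Rating forecast of relation (15) for user i, item j, context z."""
    affinity = np.sum(P[i] * Q[j] * W[z])
    return r_bar + b_user[i] + b_item[j] + affinity

# Hypothetical data: 2 users, 2 items, 1 context, 3 known ratings.
T = np.zeros((2, 2, 1))
mask = np.zeros((2, 2, 1), dtype=bool)
for (i, j), r in {(0, 0): 4.0, (0, 1): 2.0, (1, 0): 5.0}.items():
    T[i, j, 0], mask[i, j, 0] = r, True

P = np.array([[1.0], [2.0]])
Q = np.array([[1.0], [0.5]])
W = np.array([[0.1]])

r_bar, b_user, b_item = context_biases(T, mask, 0)
forecast = predict(1, 1, 0, r_bar, b_user, b_item, P, Q, W)
```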

The principal advantage of the proposed approach is the possibility of calculating the context profiles in order to support the collaborative filtering recommendation method with a content-based one. Indeed, a hybrid recommendation method can be developed through a proper initialization of the user and item profiles. The disadvantage of the proposed method is the need for known ratings in order to calculate the context profiles.

4 Experimental Phase

In this section, the experimental phase is described. The numerical results are obtained on the DePaulMovie dataset [19, 33], which contains 5043 known ratings collected from 97 users on 79 items, with 3 context dimensions. The 5043 ratings are divided into 1448 ratings without context and 3595 contextual ratings.

The contextual information in the dataset consists of location, which can assume the values home and cinema; companion, which can assume the values alone, family and friends; and time, which can assume the values weekend and weekday. Since the proposed approach is defined on one contextual dimension, the three contextual domains of DePaulMovie are combined into a single one, as shown in Table 1. Hence, the value of l is 12 (2 × 3 × 2), the value of m is 97 and the value of n is 79.

Table 1 Values that context can assume in the proposed approach
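The combination of the three contextual domains into a single dimension with l = 12 values can be sketched as follows. The ordering of the combinations is an assumption of this example and may differ from the one used in Table 1; the names `CONTEXTS` and `CONTEXT_INDEX` are hypothetical.

```python
from itertools import product

LOCATION = ("home", "cinema")
COMPANION = ("alone", "family", "friends")
TIME = ("weekend", "weekday")

# Every (location, companion, time) triple becomes one value of the
# single contextual dimension; the enumeration order is arbitrary.
CONTEXTS = list(product(LOCATION, COMPANION, TIME))
CONTEXT_INDEX = {context: z for z, context in enumerate(CONTEXTS)}

l = len(CONTEXTS)
z = CONTEXT_INDEX[("cinema", "friends", "weekday")]
```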

The aim of the experimental phase is to evaluate the accuracy of the proposed approach. To this end, mean absolute error (MAE) and root mean squared error (RMSE) are calculated according to the following formulas:

$$\begin{aligned} \text{ MAE } = \frac{1}{\left| D \right| } \sum _{\mathcal {R}_{ijz} \in D} \left| \mathcal {R}_{ijz} - \hat{\mathcal {R}}_{ijz} \right| \end{aligned}$$
(16)
$$\begin{aligned} \text{ RMSE } = \sqrt{\frac{1}{\left| D \right| } \sum _{\mathcal {R}_{ijz} \in D} \left( \mathcal {R}_{ijz} - \hat{\mathcal {R}}_{ijz} \right) ^2} \end{aligned}$$
(17)

where D is the dataset that contains the contextual ratings, \(\mathcal {R}_{ijz}\) is the known rating in the dataset D and \(\hat{\mathcal {R}}_{ijz}\) is the rating forecast provided by the context-aware recommender system. Moreover, fivefold cross-validation [26] is exploited.

The results of the comparison methods are taken from CarsKit [33]. The numerical results are presented in Table 2.

Table 2 Numerical results on DePaulMovie dataset

Table 2 shows that the proposed approach returns better results than the comparison methods.
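The error measures of relations (16) and (17), together with a fivefold split of the contextual ratings, can be sketched as follows. The sample ratings, the random seed and the function names are hypothetical; only the number of contextual ratings (3595) comes from the dataset description.

```python
import numpy as np

def mae(known, forecast):
    """Mean absolute error, relation (16)."""
    known, forecast = np.asarray(known, float), np.asarray(forecast, float)
    return float(np.mean(np.abs(known - forecast)))

def rmse(known, forecast):
    """Root mean squared error, relation (17)."""
    known, forecast = np.asarray(known, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((known - forecast) ** 2)))

def kfold_indices(n_ratings, n_folds=5, seed=0):
    """Shuffle the rating indices and split them into n_folds test folds."""
    permutation = np.random.default_rng(seed).permutation(n_ratings)
    return np.array_split(permutation, n_folds)

# Hypothetical known ratings and forecasts.
known = [4.0, 3.0, 5.0]
forecast = [3.5, 3.0, 4.0]
example_mae = mae(known, forecast)
example_rmse = rmse(known, forecast)

# Each fold is used once as the test set; the other four form the training set.
folds = kfold_indices(3595)
```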

5 Conclusions and Future Works

This paper focuses on recommender systems and context-aware recommender systems. In particular, the singular value decomposition and CANDECOMP are presented in Sect. 2, and they are exploited in Sect. 3 in order to present a heuristic context-aware recommender system. Finally, the numerical results are shown in Sect. 4. The experimental phase shows that the proposed approach returns better results than the comparison methods.

The proposed approach can be improved in several ways. Since all items and users have the same contextual profile, a neighbor method will be exploited in order to separate the items whose ratings improve in a given context from the items whose ratings get worse. Moreover, other datasets are going to be exploited in order to confirm the validity of the proposed approach.

Finally, a proper method for creating user and item profiles is going to be developed in order to obtain a hybrid approach that enables the system to overcome the cold-start problem related to the lack of known ratings.