
1 Introduction

In recent years, with the rapid development of social networks, people have been able to share their experiences on Location-Based Social Network (LBSN) services such as Foursquare and Facebook, including their opinions of the points of interest (POIs) they have visited. To help people find places they are likely to prefer, POI recommendation has been proposed, and it plays a significant role in LBSN research.

POI Recommending:

The goal of POI recommendation is to learn users' preferences and recommend places they are likely to enjoy. To distinguish between types of users, Li [1] proposed IEMF, an approach that recommends the next POI by learning users' intrinsic and extrinsic interests. IEMF argues that users check in outside their activity region only for extrinsic reasons. However, users may check in outside the activity region for other reasons as well, and the construction of the activity region is poorly defined. Our work differs from IEMF both in how the activity region is built and in how recommendations are made. Another current approach is PG-rank [2], which combines geographical influence and category influence. However, not all users are affected by geographical influence, because distance has little effect on the choice among POIs inside the activity region.

We suggest that POI recommendation should incorporate geographical influence selectively, because it is a significant factor when users make decisions. For example, a user may want to check in at a restaurant that is famous on social networks, but give up the idea because it is too far away. On the other hand, geographical influence should not be applied to every POI recommendation, because it does not play a significant role inside the user's activity region. For example, a user is unlikely to change the decision to check in at a nearby restaurant because of a small difference in distance.

To solve the aforementioned problem, we propose a new Bayesian recommendation approach based on location and category influence, named ABPR (Activity region Bayesian Personalized Ranking). The approach analyzes each user's POI check-in records and then builds the user's activity region. For POIs inside the region, we recommend using category influence only; for POIs outside the region, we recommend using both geographical influence and category influence. The parameters are updated with the SGD algorithm. As an illustration, imagine that you want to have a meal and there is a restaurant where you always check in: would you change your decision and eat at another POI just because it is slightly closer to where you live?

We believe that related research has the following shortcomings:

  (1) When adding geographical influence, most studies apply it to all POIs, but for POIs close to the user, geographical influence is not the most significant factor in the user's decision;

  (2) Studies involving activity regions always model them as circles, which may not accurately capture the regions where users are frequently active;

  (3) Some users like to explore new POIs, while others prefer similar ones. This has not been well analyzed in recent studies.

To solve these problems, we propose the ABPR approach. Its advantages are:

  (1) Inside the activity region, we recommend by category without geographical influence;

  (2) We compute the activity region by minimizing a loss function to improve precision;

  (3) We add a parameter that measures a user's preference for exploring new places.

2 Related Work

POI recommendation has received a lot of attention recently. Existing work can be classified into the following categories:

Geographical Influence Based Recommending

There is a tremendous amount of research on recommendation based on geographical influence. Some scholars argue that geographical influence may change users' check-in behavior in an LBSN. Following this idea, [3] first added geographical influence to POI recommendation. Then, [4] emphasized geographical influence by using the Gaussian Mixture Model (GMM) and Matrix Factorization (MF), but the reasons for a check-in are complex and were not analyzed concretely. [5, 6] showed that geographical influence can increase the precision of the algorithms; however, neither of them incorporated geographical influence appropriately. [7] proposed the Potential Dirichlet Distribution to integrate the location-based model. These studies indicate that adding geographical influence may improve algorithm performance.

Category Influence Based Recommending

Category-based recommendation is the earliest and most widely used approach. [8] proposed the Bayesian Personalized Ranking (BPR) approach, which brings Bayesian reasoning into POI recommendation research in order to analyze the latent information hidden in the check-in data. Building on this, [9] proposed LBPR, a listwise Bayesian Personalized Ranking approach, to address successive recommendation. [10] also analyzed users' most recent patterns to recommend the next POI a user is most likely to be interested in; however, a disadvantage of that work is that it does not distinguish between different types of users.

Time-Aware Based Recommending

This more recent line of work aims to capture the temporal features of activities and has been well studied lately. Because human daily activity is regular, recommendation with temporal influence helps researchers capture users' POI preferences more precisely. [15] first added temporal influence to POI recommendation. [11] divided time into slots of days, months and years, but these slots are too long to capture users' activity patterns.

Social Influence Based Recommending

Recommendation based on social influence is inspired by the intuition that friends in an LBSN share some common interests. [12] showed that social relations exert more influence on long-distance travel. [13] proposed the PMRE-GTS model to show the connection between social relations and temporal influence.

Because of data sparsity and time complexity, most of the literature builds models with only two kinds of influence, and most of these models do not perform well on long-distance POIs.

3 The ABPR Model

The recommendation task is defined as follows:

Given users' check-in behaviors and the locations of the POIs, we aim to recommend the top-K locations that a user might be interested in but has never visited or has rarely checked in at.

3.1 Our Framework

Activity Regions.

People’s activity is confined to some scope [14], yet most recent research applies geographical influence to all POIs. Inside the activity region, however, the effect of geographical influence is much lower than usually assumed.

Based on these considerations, we propose the concept of the activity region. Users usually check in around one or more regions, such as school, workplace and home. We consider that adding geographical influence to recommendations inside such a region would reduce precision. We therefore use the SGD method to build the activity region. Inside the region we perform category-based recommendation; outside the region we perform recommendation based on both category and geographical influence, because there the distance may strongly influence the check-in decision. Figure 1 illustrates the idea of the activity region.

Fig. 1. The activity region (Color figure online)

The blue circle is the activity region; the different colors of the points indicate different reasons for checking in.

For the outside points, we add geographical influence. The loss function for building the activity region is:

$$ J(\theta_{\text{a}} ) = \frac{1}{2m}\sum\limits_{i = 1}^{m} {(t_{{\theta_{a} }} (x^{(i)} ) - y^{(i)} )^{2} } $$
(1)

Here \( x^{(i)} \) is the coordinate of the i-th check-in location, \( y^{(i)} \) is the activity region, \( t_{{\theta_{a} }} \) is the matrix of POI points, \( \theta_{a} \) is the parameter matrix of t, and \( (t_{{\theta_{a} }} (x^{(i)} ) - y^{(i)} )^{2} \) is the distance from an outside POI to the activity region. We minimize the sum of these distances:

$$ A(\theta_{\text{a}} ) = \arg \hbox{min} \frac{1}{2m}\sum\limits_{i = 1}^{m} {(t_{{\theta_{a} }} (x^{(i)} ) - y^{(i)} )^{2} } $$
(2)

To minimize this formula, we use the SGD method to calculate the parameter matrix \( \theta_{a} \) and obtain the activity region. The gradient with respect to \( \theta_{a} \) is:

$$ \frac{{\partial A(\theta_{a} )}}{{\partial \theta_{a} }} = \frac{1}{m}\sum\nolimits_{i = 1}^{m} {(t_{{\theta_{a} }} (x^{(i)} ) - y^{(i)} )x^{(i)} } $$
(3)

At the minimum of \( A(\theta_{a} ) \), the gradient \( \frac{{\partial A(\theta_{a} )}}{{\partial \theta_{a} }} \) should be 0. We use \( \alpha_{1} \) as the step size, and each iteration updates the parameter as follows:

$$ \theta_{a} = \theta_{a} - \alpha_{1} \left( \frac{1}{m}\sum\nolimits_{i = 1}^{m} {(t_{{\theta_{a} }} (x^{(i)} ) - y^{(i)} )x^{(i)} } \right) $$
(4)

until \( \theta_{a} \) converges.
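The iteration in Eqs. (3) and (4) can be sketched in a few lines of Python. The sketch assumes a linear hypothesis \( t_{\theta_{a}}(x) = \theta_{a}^{T} x \), which is what makes the gradient in Eq. (3) hold, and it uses the full-batch sum over all m check-ins exactly as Eq. (4) is written rather than a sampled stochastic step. The function names, the bias column, the target vector `y` and the default step size are illustrative assumptions, not part of the original method.

```python
import numpy as np

def fit_activity_region(x, y, alpha1=0.1, n_iters=500):
    """Minimise 1/(2m) * sum((x @ theta - y)**2) by gradient descent.

    x: (m, d) array of check-in coordinates (assumed to include a bias column).
    y: (m,) array of targets describing the activity region (assumed encoding).
    """
    m, d = x.shape
    theta = np.zeros(d)
    for _ in range(n_iters):
        residual = x @ theta - y       # t_theta(x^(i)) - y^(i)
        grad = (x.T @ residual) / m    # gradient, Eq. (3)
        theta = theta - alpha1 * grad  # update, Eq. (4)
    return theta

def loss(x, y, theta):
    """Value of the loss function in Eq. (1)."""
    return float(np.sum((x @ theta - y) ** 2) / (2 * len(y)))
```

With a step size that is small enough for the data at hand, the loss decreases at every iteration; a step that is too large makes the iteration diverge, so \( \alpha_{1} \) has to be tuned per dataset.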

The Process of Naive Bayesian.

The posterior probability can be defined as:

$$ P(l\left| {l_{u} } \right.) \propto P(l)P(l_{u} \left| l \right.) $$
(5)

Under the naive Bayes assumption that the historical check-ins are conditionally independent given l, this becomes:

$$ P(l\left| {l_{u} } \right.) = P(l)\prod\limits_{{l' \in l_{u} }} {P(l'\left| l \right.)} $$
(6)

where \( P(l) \) is the prior probability of checking in at POI \( l \), \( l_{u} \) is the set of the user's historical check-in points, and \( l' \) is one historical check-in point.
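As a minimal sketch of how Eq. (6) could be evaluated, the Python function below scores each unvisited POI by \( \log P(l) + \sum_{l' \in l_{u}} \log P(l'|l) \). The way the probabilities are estimated (the share of users who visited a POI, with Laplace smoothing) is an assumption made for illustration, since the text does not specify an estimator; the function name and arguments are likewise hypothetical.

```python
import math
from collections import Counter

def naive_bayes_scores(histories, user_history, smoothing=1.0):
    """Log of Eq. (6) for every POI the target user has not visited.

    histories: list of check-in lists, one per training user.
    user_history: POIs visited by the target user (the set l_u).
    """
    pois = sorted({p for h in histories for p in h})
    visits = Counter(p for h in histories for p in set(h))
    co_visits = Counter()
    for h in histories:
        unique = set(h)
        for a in unique:
            for b in unique:
                if a != b:
                    co_visits[(a, b)] += 1
    n_users = len(histories)
    visited = set(user_history)
    scores = {}
    for l in pois:
        if l in visited:
            continue  # only unvisited POIs are candidates
        # Assumed estimate of P(l): smoothed share of users who visited l.
        log_p = math.log((visits[l] + smoothing) / (n_users + 2 * smoothing))
        for l_prime in visited:
            # Assumed estimate of P(l' | l): smoothed share of l's visitors who also visited l'.
            p_cond = (co_visits[(l, l_prime)] + smoothing) / (visits[l] + 2 * smoothing)
            log_p += math.log(p_cond)
        scores[l] = log_p
    return scores
```

Working in log space avoids numerical underflow when a user's history \( l_{u} \) is long and the product in Eq. (6) has many factors.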

The geographical Influence.

Both geographical influence and category influence may lead users to check in outside the activity region. We therefore add geographical influence to the process of POI recommendation.

We use the matrix \( U \in R^{k*1} \) as the set of users, \( l \in L^{n*1} \) as the set of POI points, and \( \theta_{l} \) as the parameter matrix. We use the Euclidean distance \( D_{{l_{i} l_{j} }} \) between the users and the POI points.

In other words, the distance between a user and a POI may influence the check-in decision. For example, a user may check in at a shop 3 km from home but not at one 6 km away. We define \( \left| {X(l_{i} ) - X(l_{c} )} \right|^{2} \) as the distance from POI \( l_{i} \) to the activity region. To compare which of two POIs \( l_{i} \) and \( l_{j} \) is nearer, we take the difference:

$$ \left| {X(l_{i} ) - X(l_{c} )} \right|^{2} - \left| {X(l_{j} ) - X(l_{c} )} \right|^{2} $$
(7)

We use this formula to determine whether \( l_{i} \) or \( l_{j} \) is closer to the activity region. If \( D_{{l_{c} l_{i} }} - D_{{l_{c} l_{j} }} < 0 \), then \( l_{i} \) is closer. To make the ranking easier, we negate it as:

$$ - \left( {D_{{l_{c} l_{i} }} - D_{{l_{c} l_{j} }} } \right) > 0 $$
(8)
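The comparison in Eqs. (7) and (8) can be illustrated with a short Python sketch. It assumes that POIs and the activity-region centre \( l_{c} \) are given as planar coordinates in the same unit (for example kilometres); the function name is hypothetical.

```python
import numpy as np

def closeness_margin(x_i, x_j, x_c):
    """Return -(D_ci - D_cj); positive when l_i is closer to l_c than l_j."""
    d_ci = np.sum((np.asarray(x_i) - np.asarray(x_c)) ** 2)  # |X(l_i) - X(l_c)|^2
    d_cj = np.sum((np.asarray(x_j) - np.asarray(x_c)) ** 2)  # |X(l_j) - X(l_c)|^2
    return float(-(d_ci - d_cj))
```

For the shop example above, with the centre at the origin, a shop at 3 km and another at 6 km give squared distances 9 and 36, so the margin is 27 and the nearer shop is ranked first.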

The Desire to Explore the New POI Point.

In daily life, some people like to explore new POIs, while others prefer to check in at similar ones. So far, little research has considered an individual's desire to explore. We introduce the desire parameter \( \chi \) to quantify the desire to explore new POIs. If a user prefers similar POIs, we recommend POIs based on the category factor.

We define the check-in set as U, \( \vartheta_{u} \) as the number of check-in categories and \( \eta \) as the frequency of check-ins, and we divide \( \vartheta_{u} \) by \( \eta \) as follows:

$$ \chi = \frac{{\vartheta_{u} }}{\eta } $$
(9)

Clearly, \( \chi \) is a constant for each user. If it is above the threshold \( \theta_{q} \), we consider that the user prefers to check in at similar POIs.
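A minimal Python sketch of Eq. (9) follows. It assumes that \( \eta \) is the user's total number of check-ins and that each check-in carries one category label; the function name is hypothetical.

```python
def exploration_ratio(categories):
    """chi = (number of distinct check-in categories) / (number of check-ins)."""
    if not categories:
        raise ValueError("user has no check-ins")
    return len(set(categories)) / len(categories)
```

For example, a user with check-ins labelled food, food, bar, food has \( \vartheta_{u} = 2 \) and \( \eta = 4 \), so \( \chi = 0.5 \); the value is then compared with the threshold \( \theta_{q} \).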

3.2 Parameter Estimation

Following the idea of Bayesian personalized ranking, we build the objective by maximizing the posterior probability, and then iterate to estimate the parameters, combining the activity region, the geographical influence and the desire parameter. The objective to be optimized is:

$$ \gamma = {\text{argmax}}\prod\limits_{{{\text{u}} \in {\text{U}}}} {\prod\limits_{{{\text{l}} \in L^{n*1} }} {\prod\limits_{{l_{i} \in L^{n*1} }} {\prod\limits_{{l_{j} \in L^{n*1} }} {\chi {\text{P(>}}_{{{\text{u,l}} \in {\text{a}}}} \left| \theta \right. )} } } } {\text{P}}(\theta ) $$
(10)

We define \( \gamma \) as the probability of check-in. \( \theta \) = {X(L), X(U), X(a), X(q)} is the set of parameters to be optimized. Using the sigmoid function \( \sigma (x) = \frac{1}{{1 + e^{ - x} }} \), we define the probability as:

$$ {\text{P(>}}_{{{\text{u,l}} \in {\text{a}}}} \left| \theta \right. )= \sigma (x_{u,i,j} ) $$
(11)

Taking the logarithm, the objective can be rewritten as:

$$ \begin{aligned} \gamma & = {\text{argmax}}\,\log \prod\limits_{{{\text{u}} \in {\text{U}}}} {\prod\limits_{{{\text{l}} \in L^{n*1} }} {\prod\limits_{{l_{i} \in L^{n*1} }} {\prod\limits_{{l_{j} \in L^{n*1} }} {\chi {\text{P(>}}_{{{\text{u,l}} \in {\text{a}}}} \left| \theta \right. )} } } } {\text{P}}(\theta ) \\ & = {\text{argmin}}\sum\limits_{u \in U} {\sum\limits_{{l \in L^{n*1} }} {\sum\limits_{{l_{i} \in L^{n*1} }} {\sum\limits_{{l_{j} \in L^{n*1} }} { - \log \,\sigma (x_{u,i,j} )} } } } - \log {\text{P}}(\theta ) + \theta_{q} \chi \\ \end{aligned} $$
(12)

The formula above applies inside the activity region. Outside the region, the objective is defined as:

$$ \gamma = {\text{argmin}}\sum\limits_{u \in U} {\sum\limits_{{l \in L^{n*1} }} {\sum\limits_{{l_{i} \in L^{n*1} }} {\sum\limits_{{l_{j} \in L^{n*1} }} {\left[ { - \log \sigma (x_{u,i,j} ) + \lambda_{\theta } \left\| \theta \right\|^{2} + \theta_{q} \chi } \right]} } } } + A(\theta_{a} ) $$
(13)
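To make the bracketed term of Eq. (13) concrete, the Python sketch below evaluates it for a single triple \( (u, l_{i}, l_{j}) \). The text does not define \( x_{u,i,j} \), so the sketch assumes the usual matrix-factorization form from BPR, \( x_{u,i,j} = p_{u} \cdot q_{i} - p_{u} \cdot q_{j} \), with hypothetical latent vectors `p_u`, `q_i` and `q_j`; the regularizer is applied to those three vectors only.

```python
import numpy as np

def log_sigmoid(x):
    """Numerically stable log(sigma(x)) = -log(1 + exp(-x))."""
    return -np.logaddexp(0.0, -x)

def triple_loss(p_u, q_i, q_j, lam, theta_q, chi):
    """-log sigma(x_uij) + lam * ||theta||^2 + theta_q * chi for one triple."""
    x_uij = p_u @ q_i - p_u @ q_j                    # assumed BPR-MF score difference
    reg = lam * (p_u @ p_u + q_i @ q_i + q_j @ q_j)  # lambda * ||theta||^2
    return float(-log_sigmoid(x_uij) + reg + theta_q * chi)
```

The loss shrinks as the preferred POI \( l_{i} \) is scored further above \( l_{j} \), which is the pairwise ranking behaviour the objective is meant to reward.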

For the outside POIs, we treat geographical influence as the most significant factor. Based on this idea, we rewrite the objective as:

$$ \gamma = {\text{argmax}}\prod\limits_{{{\text{u}} \in {\text{U}}}} {\prod\limits_{{{\text{l}} \in L^{n*1} }} {\prod\limits_{{l_{i} \in L^{n*1} }} {\prod\limits_{{l_{j} \in L^{n*1} }} {{\text{P(>}}_{{{\text{u,l}} \in {\text{a}}}} \left| \theta \right. )} } } } {\text{P}}(\theta ) $$
(14)

where \( {\text{P(>}}_{{{\text{u,l}} \in {\text{a}}}} \left| \theta \right. )= {\text{P}}( - \left( {D_{{l_{c} l_{i} }} - D_{{l_{c} l_{j} }} } \right) > 0\left| \theta \right.) \) is modeled with the sigmoid function:

$$ {\text{P}}( - \left( {D_{{l_{c} l_{i} }} - D_{{l_{c} l_{j} }} } \right) > 0\left| \theta \right.) = \sigma ( - (D_{{l_{c} l_{i} }} - D_{{l_{c} l_{j} }} )) $$
(15)

The objective can therefore be rewritten as:

$$ \begin{aligned} \gamma & = {\text{argmax}}\,\log \prod\limits_{{{\text{u}} \in {\text{U}}}} {\prod\limits_{{{\text{l}} \in L^{n*1} }} {\prod\limits_{{l_{i} \in L^{n*1} }} {\prod\limits_{{l_{j} \in L^{n*1} }} {{\text{P(>}}_{{{\text{u,l}} \in {\text{a}}}} \left| \theta \right. )} } } } {\text{P}}(\theta ) \\ & = {\text{argmin}}\sum\limits_{u \in U} {\sum\limits_{{l \in L^{n*1} }} {\sum\limits_{{l_{i} \in L^{n*1} }} {\sum\limits_{{l_{j} \in L^{n*1} }} { - \log \sigma ( - (D_{{l_{c} l_{i} }} - D_{{l_{c} l_{j} }} ))} } } } + \lambda_{\theta } \left\| \theta \right\|^{2} \\ \end{aligned} $$
(16)

3.3 Algorithmic Description

We first extract the training set and then optimize the parameters. The check-in dataset \( \psi \) contains \( \{ u,l^{c} ,\vartheta_{u} ,\eta \} \), which respectively describe the user, the user's location, the location and category of the POI, and the check-in frequency. The parameters are updated as follows:

$$ \theta \leftarrow \theta - \alpha \frac{\partial \gamma }{\partial \theta } $$
(17)

Here \( \alpha \) is the learning rate of SGD. To express the idea of the algorithm more intuitively, we list its steps as follows:

figure a
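As a minimal sketch of the update in Eq. (17), the Python function below performs one SGD step on a single training triple \( (u, l_{i}, l_{j}) \). It is not the authors' listing: it assumes the standard BPR matrix-factorization score \( x_{u,i,j} = p_{u} \cdot (q_{i} - q_{j}) \) and the per-triple loss \( -\log \sigma (x_{u,i,j}) + \lambda \left\| \theta \right\|^{2} \), and the names `p_u`, `q_i`, `q_j`, `alpha` and `lam` are hypothetical.

```python
import numpy as np

def sgd_step(p_u, q_i, q_j, alpha=0.05, lam=0.01):
    """One update theta <- theta - alpha * d(loss)/d(theta) for a single triple.

    loss = -log(sigma(x_uij)) + lam * (|p_u|^2 + |q_i|^2 + |q_j|^2),
    with x_uij = p_u . (q_i - q_j) (assumed BPR-MF form).
    """
    x_uij = p_u @ (q_i - q_j)
    g = -1.0 / (1.0 + np.exp(x_uij))  # d(-log sigma(x)) / dx = -(1 - sigma(x))
    grad_p = g * (q_i - q_j) + 2 * lam * p_u
    grad_q_i = g * p_u + 2 * lam * q_i
    grad_q_j = -g * p_u + 2 * lam * q_j
    return (p_u - alpha * grad_p,
            q_i - alpha * grad_q_i,
            q_j - alpha * grad_q_j)
```

Repeating this step over randomly sampled triples from the training set is the usual way BPR-style models are trained; each step pushes the score of the preferred POI above that of the other one while the regularizer keeps the vectors small.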

4 Experiments

4.1 Dataset

We use the Gowalla and Foursquare datasets to evaluate the model performance. Each check-in record contains the user’s ID and the POI location, and each location has latitude and longitude information. The data statistics are shown in Table 1.

Table 1. The data statistics

4.2 Evaluation Metrics

We quantitatively evaluate the model by its top-K recommendation performance, i.e., Precision@K and Recall@K. They are defined as follows:

$$ {\text{Precision@K = }}\frac{ 1}{\text{n}}\sum\limits_{i = 1}^{n} {\frac{{\left| {S_{i} (K) \cap \tau_{i} } \right|}}{K}} $$
(18)

$$ {\text{Recall@K = }}\frac{ 1}{\text{n}}\sum\limits_{i = 1}^{n} {\frac{{\left| {S_{i} (K) \cap \tau_{i} } \right|}}{{\left| {\tau_{i} } \right|}}} $$
(19)

where \( S_{i} (K) \) is the set of top-K unvisited locations recommended to user i, excluding the locations in the training set, and \( \tau_{i} \) is the set of locations visited by user i.
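Eqs. (18) and (19) translate directly into code. The sketch below is one possible Python implementation; the dictionary-based inputs and the decision to skip users with an empty \( \tau_{i} \) (to avoid division by zero) are assumptions made for illustration.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Mean Precision@K and Recall@K over users.

    recommended: dict user -> ranked list of recommended POIs (S_i).
    relevant: dict user -> set of POIs the user actually visited (tau_i).
    """
    precisions, recalls = [], []
    for user, ranked in recommended.items():
        truth = relevant.get(user, set())
        if not truth:
            continue  # skip users without ground truth (assumed convention)
        hits = len(set(ranked[:k]) & truth)  # |S_i(K) ∩ tau_i|
        precisions.append(hits / k)
        recalls.append(hits / len(truth))
    n = len(precisions)
    if n == 0:
        return 0.0, 0.0
    return sum(precisions) / n, sum(recalls) / n
```

For instance, if user u1 is recommended a and b at K = 2 and actually visited a, c and z, that user contributes a precision of 1/2 and a recall of 1/3 to the averages.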

4.3 Performance Comparison

We compare the proposed model with the following methods:

  • BPR [8]: optimizes the ordering relationship of users’ preferences for observed and unobserved locations;

  • USG [3]: incorporates geographical influence, social network and user interest into collaborative filtering in an additive manner;

  • IEMF [1]: learns users’ intrinsic and extrinsic interests for POI recommendation.

We run these methods and our ABPR on the real-world datasets. The comparison results are shown in Fig. 2.

Fig. 2. Performance comparison

We evaluated the model using Precision@K and Recall@K. Our model achieves improvements of about 15.4% in Precision@5 and 17.3% in Recall@20 on the Gowalla data, and 12.6% and 13.8%, respectively, on the Foursquare data. Analyzing the experimental results, we make the following observations.

First, our method outperforms all the other methods. We attribute this result to better modeling of users’ preferences.

Second, the methods with geographical influence (USG, IEMF, ABPR) perform much better than the one without it (BPR). This further illustrates the benefit of using geographical influence in POI recommendation.

Lastly, we discuss the influence of the number of iterations used to build the activity region, and how well our method of building the activity region performs. The influence is illustrated in Fig. 3. The red line shows the precision of our activity region, and the green line shows STELLAR [16], which adds geographical influence by separating POIs using distance only.

Fig. 3. Influence of the number of iterations (Color figure online)

Clearly, the ABPR method of building the activity region performs much better than STELLAR [16], which adds geographical influence by discarding POIs more than 10 km away. We compared the two methods to show the advantage of ours.

We can also conclude from the figure that the method converges once the number of iterations exceeds 100. We therefore suggest performing more than 100 iterations.

We also analyzed how the number of POIs affects the performance of the activity region. Figure 4 shows the influence:

Fig. 4. The connection between activity and the POI number

We found that our method does well when there are more than 30 POIs. The precision is not much affected when the number of POIs changes; it remains at about 80%. The recall, however, is strongly affected. The F1 score shows that when the number of POIs drops below 30, the gap between precision and recall widens. With fewer than 30 POIs, the method is therefore not well suited to building the activity region. Four example situations are shown in Fig. 5:

Fig. 5. Four situations of the POI points lower than 40 and more than 40

Our approach appears to perform poorly when there are fewer than 30 POIs, because the algorithm tries to include every POI at which the user has checked in. It does well in other situations, such as the fourth example, which has 137 POIs.

5 Conclusion

To further the study of personalized recommendation, in this paper we propose a new method that captures users' preferences based on geographical influence and category influence. We use SGD to estimate the parameters, and a performance evaluation on real-world LBSN datasets shows that our POI recommendation approach outperforms the other methods. More specifically, our method is better able to distinguish different types of users. In future work, we could consider the effects of social relations and temporal influence on personalized recommendation.