Personalized trip recommendation for tourists based on user interests, points of interest visit durations and visit recency

Lim, Kwan Hui; Chan, Jeffrey; Leckie, Christopher; Karunasekera, Shanika

doi:10.1007/s10115-017-1056-y

Personalized trip recommendation for tourists based on user interests, points of interest visit durations and visit recency

Regular Paper
Published: 05 May 2017

Volume 54, pages 375–406, (2018)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Knowledge and Information Systems Aims and scope Submit manuscript

Personalized trip recommendation for tourists based on user interests, points of interest visit durations and visit recency

Download PDF

Kwan Hui Lim ORCID: orcid.org/0000-0002-4569-0901^1,2,
Jeffrey Chan^1,3,
Christopher Leckie¹ &
…
Shanika Karunasekera¹

2901 Accesses
128 Citations
3 Altmetric
Explore all metrics

Abstract

Tour recommendation and itinerary planning are challenging tasks for tourists, due to their need to select points of interest (POI) to visit in unfamiliar cities and to select POIs that align with their interest preferences and trip constraints. We propose an algorithm called PersTour for recommending personalized tours using POI popularity and user interest preferences, which are automatically derived from real-life travel sequences based on geo-tagged photographs. Our tour recommendation problem is modeled using a formulation of the Orienteering problem and considers user trip constraints such as time limits and the need to start and end at specific POIs. In our work, we also reflect levels of user interest based on visit durations and demonstrate how POI visit duration can be personalized using this time-based user interest. Furthermore, we demonstrate how PersTour can be further enhanced by: (i) a weighted updating of user interests based on the recency of their POI visits and (ii) an automatic weighting between POI popularity and user interests based on the tourist’s activity level. Using a Flickr dataset of ten cities, our experiments show the effectiveness of PersTour against various collaborative filtering and greedy-based baselines, in terms of tour popularity, interest, recall, precision and F$_1$-score. In particular, our results show the merits of using time-based user interest and personalized POI visit durations, compared to the current practice of using frequency-based user interest and average visit durations.

PoI Recommendation System: A Blended Approach

Top-k Context-Aware Tour Recommendations for Groups

A personalized POI route recommendation system based on heterogeneous tourism data and sequential pattern mining

Article 20 August 2019

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Tour recommendation and itinerary planning are challenging tasks due to the different interest preferences and trip constraints (e.g., time limits, start and end points) of each unique tourist.^{Footnote 1} While there is an abundance of information from the Internet and travel guides, many of these resources simply recommend individual points of interest (POI) that are deemed to be popular, but otherwise do not appeal to the interest preferences of users or adhere to their trip constraints. Furthermore, the massive volume of information makes it a challenge for tourists to narrow down to a potential set of POIs to visit in an unfamiliar city. Even after the tourist finds a suitable set of POIs to visit, it will take considerable time and effort for the tourist to plan the appropriate duration of visit at each POI and the order in which to visit the POIs.

To address these issues, we propose the PersTour algorithm for recommending personalized tours where the suggested POIs are optimized to the users’ interest preferences and POI popularity. We formulate our tour recommendation problem based on the Orienteering problem [42], which considers a user’s trip constraints such as time limitations and the need for the tour to start and end at specific POIs (e.g., POIs near the tourist’s hotel). Using geo-tagged photographs as a proxy for tourist visits, we are able to extract real-life user travel histories, which can then be used to automatically determine a user’s interest level in various POI categories (e.g., parks, beaches, shopping) as well as the popularity of individual POIs. As tourists have different preference levels between POI popularity and POI relevance to their interests, our PersTour algorithm also allows tourists to indicate their preferred level of trade-off between POI popularity and his/her interest preferences. In cases where the tourist prefers to automate the indication of this trade-off between POI popularity and interest preference, PersTour is also able to determine the appropriate trade-off based on the activity level of the tourist relative to the POI visits of the general population.

Our main contributions^{Footnote 2} are as follows:

1.
We propose the PersTour algorithm for recommending personalized tour/trip itineraries with POIs and visit duration based on POI popularity, users’ interest preferences and trip constraints. Our tour recommendation problem is modeled in the context of the Orienteering problem (Sect. 3).
2.
We introduce the concept of time-based user interest for tour recommendation, where a user’s level of interest in a POI category is based on his/her time spent at such POIs, relative to the average user. We also compare our time-based user interest to the current practice of using frequency-based user interest and show how time-based user interest results in recommended tours that more accurately reflect real-life travel sequences (Sect. 3.1).
3.
We also further enhance time-based user interest by implementing an update rule such that user interests are refined based on the recency of their past POI visits. This updating works by giving more emphasis to recent POI visits than those in the more distant past (Sect. 3.1).
4.
We demonstrate the personalization of POI visit duration using time-based user interest, for the purpose of tour/trip itinerary recommendation. Our results show that personalized visit durations more accurately reflect the real-life POI visit durations of users, compared to the current practice of using average visit duration (Sect. 3.1).
5.
While the PersTour algorithm gives tourists the flexibility to indicate their preferred weightage between POI popularity and his/her interests, we also propose two schemes to automatically determine an appropriate weightage based on the tourist’s activity level, relative to the general tourist population (Sect. 3.2.1).
6.
We implement a framework (Fig. 1) for extracting real-life user travel histories based on their geo-tagged photographs, which are then used for training our PersTour algorithm and serve as ground truth for our subsequent evaluation (Sect. 4).
7.
We evaluate different variants of PersTour against various baselines using a Flickr dataset spanning ten cities. Our results show that PersTour out-performs these baselines based on tour popularity, user interest, recall, precision and F$_1$-score (Sects. 5 and 6).

The rest of the paper is structured as follows: Sect. 2 discusses some related work in tour recommendation; Sect. 3 introduces some preliminaries and defines our research problem; Sect. 4 describes our overall framework for tour recommendation; Sect. 5 outlines our experimental methodology; Sect. 6 discusses our main results and key findings; and Sect. 7 summarizes and concludes our paper.

2 Related work

Tour recommendation has been a well-studied field, with many developed applications [5, 7, 29, 43, 47] and research ranging from recommending beautiful, quiet and happy tours [33] to tour recommendation using random walks with restart [30]. In our review of related work, we focus on research related to our work and refer readers to [12, 38] for an overview on the general field of tour recommendation. In the following sections, we provide an overview of the Orienteering problem before highlighting some key works in tour itinerary recommendations.

2.1 Background on the Orienteering problem

The Orienteering problem [42] originated from a competition of the same name. In this Orienteering competition, there are multiple navigational checkpoints distributed throughout an area, where each checkpoint is associated with a certain score. The main objective of participants in this competition is to maximize their total score, which is accumulated from visiting the various checkpoints. Participants are only given a limited amount of time to maximize their scores, and the winner is the participant who has accumulated the highest score. Due to this limitation of time, participants have to strategize and select a smaller subset of checkpoints to visit and decide on the sequence to visit these checkpoints. For a more in-depth review of the Orienteering problem, we refer readers to [14, 45]. In recent years, various works have used the Orienteering problem to model different variations of the tour recommendation problem, and we discuss some of these works.

2.2 Tour recommendation based on Orienteering problem and its variants

Many tour itinerary recommendation works are based on the Orienteering problem and its variants. For example, Choudhury et al. [9] was one of the earlier tour recommendation studies based on the Orienteering problem, where recommended tours start and end at specific POIs while trying to maximize an objective score. Using a modified Orienteering problem, Gionis et al. [13] utilized POI categories such that recommended tours are constrained by a POI category visit order (e.g., museum $\rightarrow $ park $\rightarrow $ beach). Similarly, Lim [24] used a modified Orienteering problem constrained by a mandatory POI category, which corresponds to the POI category a user is most interested in. Based on user-indicated interests and trip constraints (e.g., time budget, start and end locations), Vansteenwegen et al. [44] recommended tours comprising POI categories that best match user interests while adhering to these trip constraints. In the context of theme parks, Lim et al. [25] recommended personalized itineraries with minimal queuing times at attractions, while maximizing user interests and attraction popularity. Others like [1, 28] have extended the Orienteering problem for the purpose of recommending tour itineraries for groups of tourists, with the aim of satisfying the diverse interest preferences of multiple tourists in a group.

2.3 Tour recommendation based on other combinatorial optimization problems

In contrast to works based on the Orienteering problem, there are also various tour itinerary recommendation works based on other combinatorial optimization and similar problems. For example, Brilhante et al. [4] formulated tour recommendation as a Generalized Maximum Coverage problem [10], with the objective of finding an optimal set of POIs based on both POI popularity and user interest. Thereafter, Brilhante et al. [6] extended upon the former by using a variation of the Traveling Salesman Problem, with the main aim of finding the shortest route among the set of optimal POIs recommended in [4]. In addition to user interests in tour recommendation, Chen et al. [8] also considered traveling times based on different traffic conditions, using trajectory patterns derived from taxi GPS traces. Focusing on traveling paths based on road segments between POIs, Sun et al. [40] recommended tour itineraries comprising popular POIs and interesting routes between these POIs, with POI and route popularity based on geo-tagged photographs. With further considerations for different transport modes, Kurashima et al. [19, 20] used a combined topic and Markov model to recommend tours based on both user interests and frequently traveled routes.

2.4 Top-k POI recommendation and next-location prediction

The recommendation of top-k POIs and next-location predictions is also closely related to our problem of tour itinerary recommendation. For example, LearNext [2] used Gradient Boosted Regression Trees and Ranking SVMs to predict the (single) next POI that a tourist will visit, while [49] performed a similar next-location prediction task using Markov models, along with seasonal and temporal information. Others like [36] used a category-regularized matrix factorization approach for recommending individual POIs, and Kofler et al. [17] proposed a system prototype for recommending individual POIs that are niche and specialized in nature. For top-k POI recommendations, many works utilized variants of matrix factorization or collaborative filtering approaches to recommend a ranked list of k POIs, using information such as friendship links [50], types of activities/users [21] and temporal patterns in POI visits [52].

2.5 Other tourism-related work

There are also many interesting tourism-related studies that utilize geo-tagged photographs for purposes ranging from identifying popular POIs to analyzing tourist behavior. For example, Ji et al. [15] implemented a graph modeling framework to identify popular POIs based on photographs posted in blogs, while [32] used geo-tagged photographs to understand tourist behavior based on their POI visit patterns and time spent. More generally, geo-tagged photographs have been used for other purposes such as predicting friendship relationships based on spatiotemporal links [11], identifying local clusters of interesting events and places [16] and estimating the location where a photograph is taken [22]. For a more comprehensive discussion of research that utilizes geo-tagged photographs, we direct readers to [39], who presented a comprehensive review of current applications and identified various interesting future directions.

2.6 Discussion of differences with previous work

While these previous works are the state of the art in tourism-related research, our proposed work differs from these earlier works in various aspects. First, we automatically derive a relative measure of time-based user interest using a user’s visit durations at POIs of a specific category, relative to the average visit durations of other users, whereas earlier tour recommendation works either use frequency-based user interest (based on POI visit frequency) or require users to explicitly indicate their interest preferences for tour itinerary recommendation. Second, we plan and recommend tour itineraries with personalized POI visit durations that cater to individual users based on their time-based user interests, whereas previous works recommend tour itineraries using the same non-personalized POI visit duration for all users (either the average duration or a fixed duration, e.g., 1 h at all POIs) or do not consider POI visit duration at all. Third, although the works on top-k POI recommendation and next-location prediction are related to our tour itinerary recommendation problem, our proposed problem involves the additional considerations of user interest preferences, POI popularity, time constraints, starting/ending locations and more importantly, recommending a connected tour itinerary that satisfies these considerations, instead of individual POIs. While the other tourism-related works illustrate many interesting applications of geo-tagged photographs, these works use such photographs to study tourist behavior and identify popular POIs, which are distinctly different from the task of recommending a personalized tour itinerary.

3 Background and Problem definition

In this section, we first examine some preliminary definitions, before introducing a formulation of our tour recommendation problem.

3.1 Preliminaries

If there are m POIs for a particular city, let $P = \{p_1, \ldots ,p_m\}$ be the set of POIs in that city. Each POI p is also labeled with a category $\textit{Cat}_p$ (e.g., church, park, beach) and latitude/longitude coordinates. We denote a function $\textit{Pop}(p)$ that indicates the popularity of a POI p, based on the number of times POI p has been visited. Similarly, the function $T^{\textit{Travel}}(p_x, p_y)$ measures the time needed to travel from POI $p_x$ to $p_y$, based on the distance between POIs $p_x$ and $p_y$ and the indicated traveling speed. For simplicity, we use a traveling speed of 4 km/h, i.e., a leisure walking speed.^{Footnote 3}

Definition 1

Travel history Given a user u who has visited n POIs, we define his/her travel history as an ordered sequence, $S_u = ((p_1, t^a_{p_1}, t^d_{p_1}), \ldots ,(p_n, t^a_{p_n}, t^d_{p_n}))$, with each triplet $(p_x, t^a_{p_x}, t^d_{p_x})$ comprising the visited POI $p_x$, and the arrival time $t^a_{p_x}$ and departure time $t^d_{p_x}$ at POI $p_x$. Thus, the visit duration at POI $p_x$ can be determined by the difference between $t^a_{p_x}$ and $t^d_{p_x}$. Similarly, for a travel sequence $S_u$, $t^a_{p_1}$ and $t^d_{p_n}$ also indicate the start and end time of the itinerary, respectively. For brevity, we simplify $S_u = ((p_1, t^a_{p_1}, t^d_{p_1}), \ldots ,(p_n, t^a_{p_n}, t^d_{p_n}))$ as $S_u = (p_1, \ldots ,p_n)$.

Definition 2

Travel sequence Based on the travel history $S_u$ of a user u, we can further divide this travel history into multiple travel sequences, i.e., sub-sequences of $S_u$. We divide a travel history $S_u$ into separate travel sequences if $t^d_{p_x} - t^a_{p_x+1} > \tau $. That is, we separate a travel history into distinct travel sequences if the consecutive POI visits occur more than $\tau $ time units apart. Similar to other works [9, 24], we choose $\tau =8~h$ in our experiments. These travel sequences also serve as the ground truth of real-life user trajectories, which are subsequently used for evaluating our PersTour algorithm and baselines. For a user u with n travel sequences, we use $S^1_u, S^2_u, \ldots ,S^n_u$ to denote the different travel sequences in temporal order, such that $S^1_u$ took place before $S^2_u$.

Definition 3

Average POI visit duration Given a set of travel histories for all users U, we determine the average visit duration for a POI p as follows:

$$\begin{aligned} {\bar{V}}(p) = \frac{1}{n} {\sum \limits _{u \in U}} {\sum \limits _{p_x \in S_u} \left( t^d_{p_x} - t^a_{p_x}\right) } \delta (p_x=p), \quad \forall ~p \in P \end{aligned}$$

(1)

where n is the number of visits to POI p by all users and $\delta (p_x=p)=\{^{1,~ p_x=p}_{0,~ \hbox {otherwise}}$. ${\bar{V}}(p)$ is commonly used in tour recommendation as the POI visit duration for all users [4, 6, 8], while many earlier works do not factor in POI visit durations at all. In our work, we show how recommended POI visit durations can be personalized to individual users based on their interest (Definition 5), and use ${\bar{V}}(p)$ as a comparison baseline, i.e., the non-personalized POI visit duration.

Definition 4

Time-based user interest As described earlier, the category of a POI p is denoted $\textit{Cat}_p$. Given that C represents the set of all POI categories, we determine the interest of a user u in POI category c as follows:

$$\begin{aligned} \textit{Int}^{\textit{Time}}_u(c) = \sum \limits _{p_x \in S_u} \frac{\left( t^d_{p_x} - t^a_{p_x}\right) }{{\bar{V}}(p_x)} \delta (\textit{Cat}_{p_x}=c), \quad \forall ~c \in C \end{aligned}$$

(2)

where $\delta (\textit{Cat}_{p_x}=c)=\{^{1,~ \textit{Cat}_{p_x}=c}_{0,~ \hbox {otherwise}}$. In short, Eq. 2 determines the interest of a user u in a particular POI category c, based on his/her time spent at each POI of category c, relative to the average visit duration (of all users) at the same POI. The rationale is that a user is likely to spend more time at a POI that he/she is interested in. Thus, by calculating how much more (or less) time a user is spending at POIs of a certain category compared to the average user, we can determine the interest level of this user in POIs of this category.

Definition 5

Personalized POI visit duration Based on our definition of time-based user interest (Eq. 2), we are able to personalize the recommended visit duration at each POI based on each user’s interest level. We determine the personalized visit duration at a POI p for a user u as follows:

$$\begin{aligned} T^{\textit{Visit}}_u(p) = {\textit{Int}^{\textit{Time}}_u(\textit{Cat}_p)} \times {{\bar{V}}(p)} \end{aligned}$$

(3)

That is, we are recommending a personalized POI visit duration based on user u’s relative interest level in category $\textit{Cat}_p$ multiplied by the average time spent at POI p. Thus, if a user is more (less) interested in category $\textit{Cat}_p$, he/she will spend more (less) time at POI p than the average user.

Definition 6

Frequency-based user interest We also define a simplified version of user interest, denoted $\textit{Int}^{\textit{Freq}}_u(c)$, which is based on the number of times a user visits POIs of a certain category c (i.e., the more times a user visits POIs of a specific category, the more interested this user is in that category). As using $\textit{Int}^{\textit{Freq}}_u(c)$ is the current practice in tour recommendation research [4, 6, 24], we include it for a more complete study and as a comparison baseline to our proposed $\textit{Int}^{\textit{Time}}_u(c)$.

Definition 7

Time-based user interest with weighted updates We improve upon the original time-based user interest (Definition 4) by giving more emphasis to recent POI visits and less emphasis to POI visits in the distant past. Algorithm 1 details our proposed algorithm. In Line 9 of Algorithm 1, we continuously update user u’s interest by minimizing the error between his/her recommended and actual POI visit duration, while $\frac{i}{n}$ ensures that more emphasis is given to more recent POI visits. Lines 6 to 8 calculate the error between the recommended and actual POI visit duration, while Lines 4 and 5 ensure that we perform this update for all POIs in all travel sequences of user u.

The intuition behind Algorithm 1 is that more recent POI visits are more relevant to a user and thus should contribute more to the modeling of this user’s interest. Similarly, other researchers have also observed people’s preference for more recent activities/information and utilized this recency preference for next check-in location prediction [26], location-based domain expert identification [23] and personalized music recommendation [35].

Definition 8

Personalized POI visit duration with weighted updates Similar to Definition 5, we can then recommend a personalized POI visit duration to POI p for a user u based on his/her time-based user interest with weighted updates, as follows:

$$\begin{aligned} T^{\textit{VisitUpd}}_u(p) = {\textit{Int}^{\textit{Upd}}_u(\textit{Cat}_p)} \times {{\bar{V}}(p)} \end{aligned}$$

(4)

Similar to Definition 5, we are personalizing the POI visit duration for user u based on his/her updated interest level in category $\textit{Cat}_p$ multiplied by the average time that users spend at POI p.

3.2 Problem definition

We now define our tour recommendation problem in the context of the Orienteering problem and its integer problem formulation [24, 42, 45]. Given the set of POIs P, a budget B, starting POI $p_1$ and destination POI $p_N$, our goal is to recommend an itinerary $I=(p_1, \ldots ,p_N)$ that maximizes a certain score S while adhering to the budget B.^{Footnote 4} In this case, the score S is represented by the popularity and user interest of the recommended POIs using the functions $\textit{Pop}(p)$ and $Int(\textit{Cat}_p)$, respectively. The budget B is based on time spent and calculated using the function $\textit{Cost}(p_x, p_y) = T^{\textit{Travel}}(p_x, p_y) + T^{\textit{Visit}}_u(p_y)$, i.e., using both traveling time and personalized visit duration at the POI. One main difference between our work and earlier work is that we personalize the visit duration at each recommended POI based on user interest (Definition 5), instead of using the average visit duration for all users or not considering visit duration at all. Formally, we want to find an itinerary $I = (p_1, \ldots ,p_N)$ that:

$$\begin{aligned} \textit{Max} \sum \limits _{i=2}^{N-1} \sum \limits _{j=2}^N x_{i,j} \Big ( \eta \textit{Int}(\textit{Cat}_i) + (1 - \eta ) \textit{Pop}(i) \Big ) \end{aligned}$$

(5)

where $x_{i,j}=1$ if both POI i and j are visited in sequence (i.e., we travel directly from POI i to j), and $x_{i,j}=0$ otherwise. We attempt to solve for Eq. 5, such that:

$$\begin{aligned}&\sum \limits _{j=2}^N x_{1,j} = \sum \limits _{i=1}^{N-1} x_{i,N} = 1 \end{aligned}$$

(6)

$$\begin{aligned}&\sum \limits _{i=1}^{N-1} x_{i,k} = \sum \limits _{j=2}^N x_{k,j} \le 1, \quad \forall ~k=2, \ldots ,N-1 \end{aligned}$$

(7)

$$\begin{aligned}&\sum \limits _{i=1}^{N-1} \sum \limits _{j=2}^N \textit{Cost}(i,j) x_{i,j} \le B \end{aligned}$$

(8)

$$\begin{aligned}&2 \le p_i \le N, \quad \forall ~i=2, \ldots ,N \end{aligned}$$

(9)

$$\begin{aligned}&p_i - p_j + 1 \le (N-1)(1-x_{i,j}), \quad \forall ~i,j=2, \ldots ,N \end{aligned}$$

(10)

Equation 5 is a multi-objective function that maximizes the popularity and interest of all visited POIs in the itinerary, where $\eta $ is the weighting given to the popularity and interest components. Equation 5 is also subject to constraints 6–10. Constraint 6 ensures that the itinerary starts at POI 1 and ends at POI N, while constraint 7 ensures that the itinerary is connected and no POIs are visited more than once. Constraint 8 ensures that the time taken for the itinerary is within the budget B, based on the function $\textit{Cost}(p_x, p_y)$ that considers both traveling time and personalized POI visit duration. Given that $p_x$ is the position of POI x in itinerary I, constraints 9 and 10 ensure that there are no sub-tours in the proposed solution, adapted from the sub-tour elimination used in the Traveling Salesman Problem [31].

Based on this problem definition, we can then proceed to solve our tour recommendation problem as an integer programming problem. For solving this integer programming problem, we used the lpsolve linear programming package [3]. We denote our proposed algorithm for personalized tour recommendation as PersTour and shall describe our overall framework and the different PersTour variants in the following section.

3.2.1 Adaptive weighting

As introduced in Eq. 5, the $\eta $ parameter offers tourists the flexibility to indicate their preferences for POI popularity and interest alignment. In this section, we propose two methods that automatically determine an appropriate value for the $\eta $ parameter based on the POI visits by the general user population.

Given all users U and their set of travel histories $S_{u \in U}$, we define the number of POI visit count for a user u as: $C_u = |S_u|$. Similarly, $C_{\textit{max}}$ denotes the maximum POI visit count out of all users $u \in U$. We determine the $\eta $ value (i.e., adaptive weighting) for a user u using the following two methods.

Adaptive weights based on scaling (PT-AS) This method determines the $\eta $ value for a user u as follows: $\eta = \frac{C_u}{C_{\textit{max}}}$. In short, we are scaling the POI visit count of a user u by the maximum POI visit count of all users.
Adaptive weights based on cumulative distribution (PT-AC) This method determines the $\eta $ value for a user u as follows: $\eta = P(C \le C_u)$. That is, we are building a probability distribution function based on all users’ POI visit counts, and then calculating the probability that a random variable C (i.e., the POI visit count) is less than or equal to the POI visit count of user u.

4 Tour recommendation framework

Figure 1 outlines our overall tour recommendation framework. This framework requires a list of POIs (with lat/long coordinates and POI categories) and a set of geo-tagged photographs (with lat/long coordinates and time taken), which can be easily obtained from Wikipedia and Flickr, respectively. Thereafter, the main steps in our framework are:

Step 1: Determine POI visits (map photographs to POIs) We first determine the POI visits in each city by mapping the set of geo-tagged photographs to the list of POIs. In particular, we map a photograph to a POI if their coordinates differ by <200m based on the Haversine formula [37], which is used for calculating spherical (earth) distances. If a photograph is within 200m of multiple POIs, we only map this photograph to the nearest POI, i.e., no photograph is mapped to multiple POIs.

Step 2: Construct travel history/sequences Based on the POI visits from Step 1, we can construct the travel history of each user by sorting their POI visits in ascending temporal order (Definition 1). Using each user’s travel history, we then proceed to group consecutive POI visits as an individual travel sequence, if the consecutive POI visits differ by <8 h (Definition 2). Thus, we are also able to determine the POI visit duration based on the time difference of the first and last photograph taken at each POI.

Step 3: Recommend tours using PersTour. As described in Sect. 3.2, there can be different variants of PersTour, based on the value of $\eta $ and the type of interest function chosen. The value of $\eta $ indicates the weight given to either POI popularity or user interest, while the interest function can be either frequency-based interest ($\textit{Int}^{\textit{Freq}}_u$), time-based interest ($\textit{Int}^{\textit{Time}}_u$) or time-based interest with weighted updates ($\textit{Int}^{\textit{Upd}}_u$). We experiment with the following variants:

PersTour using $\eta =0\,(PT-0)$ PersTour with full emphasis on optimizing POI popularity, ignoring user interest (i.e., no need to choose between ${\textit{Int}}^{\textit{Time}}_u$ or ${\textit{Int}}^{\textit{Freq}}_u$).
PersTour using $Int^Freq_u$ and $\eta =0.5$ $(PT-.5F)$ PersTour with balanced emphasis on optimizing both POI popularity and frequency-based user interest.
PersTour using $Int^Time_u$ and $\eta =0.5$ $(PT-.5T)$ PersTour with balanced emphasis on optimizing both POI popularity and time-based user interest.
PersTour using $Int^Upd_u$ and $\eta =0.5$ $(PT-.5U)$ PersTour with balanced emphasis on optimizing both POI popularity and time-based user interest with weighted updates.
PersTour using ${\textit{Int}}^{\textit{Freq}}_u$ and $\eta =1$ (PT-1F) PersTour with full emphasis on optimizing frequency-based user interest, ignoring POI popularity.
PersTour using $Int^Time_u$ and $\eta =1$ $(PT-1T)$ PersTour with full emphasis on optimizing time-based user interest, ignoring POI popularity.
PersTour using $Int^Upd_u$ and $\eta =1$ $(PT-1U)$ PersTour with full emphasis on optimizing time-based user interest with weighted updates, ignoring POI popularity.
PersTour using $Int^Upd_u$ and adaptive weighting $\eta $ by scaling $(PT-AS)$ PersTour with emphasis on optimizing both POI popularity and time-based user interest with weighted updates, where emphasis is based on adaptive weighting by scaling of POI visit counts.
PersTour using $Int^Upd_u$ and adaptive weighting $\eta $ by cumulative distribution $(PT-AC)$ PersTour with emphasis on optimizing both POI popularity and time-based user interest with weighted updates, where emphasis is based on adaptive weighting by cumulative distribution of POI visit counts.

These variants allow us to best evaluate the effects of different $\eta $ values and compare between frequency-based interest and time-based interest (with and without weighted updates). As PT-0 does not consider user interest, there is no need to choose between time-based or frequency-based user interest. The PT-0, PT-.5F, PT-.5T, PT-.5U and PT-1F, PT-1T, PT-1U algorithms allow us to investigate the effect of different emphasis on POI popularity and the different types of user interests, i.e., by adjusting the $\eta $ parameter. These algorithms offer tourists the flexibility to explicitly specify their preference between the two components of POI popularity and user interests. If the tourist prefers to determine this preference automatically, the PT-AS and PT-AC algorithms provide alternatives where this emphasis (i.e., the $\eta $ parameter) between the two components of POI popularity and user interests can be automatically learned.

5 Experimental methodology

In this section, we elaborate on the experimental dataset, baseline algorithms and evaluation metrics that are used for our experimental evaluation.

5.1 Dataset

For our experiments, we use the Yahoo! Flickr Creative Commons 100M (YFCC100M) dataset [41, 48], which consists of 100M Flickr photographs and videos. This dataset also comprises the meta information regarding the photographs, such as the date/time taken, geo-location coordinates and accuracy of these geo-location coordinates. The geo-location accuracy ranges from world level (least accurate) to street level (most accurate).

Table 1 Dataset description

Full size table

Using the YFCC100M dataset, we extracted geo-tagged photographs that were taken in ten different cities, namely Toronto, Osaka, Glasgow, Budapest, Perth, Vienna, Delhi, Edinburgh, Tokyo and London. To ensure the best accuracy and generalizability of our results, we only chose photographs with the highest geo-location accuracy and experimented on ten touristic cities around the world. A more detailed description of our dataset is shown in Table 1. This dataset is also publicly available at https://sites.google.com/site/limkwanhui/datacode#ijcai15.

5.2 Baseline algorithms

We compare our PersTour algorithms against five different baseline algorithms, which can be divided into two broad categories. The first category is based on the popular collaborative filtering (CF) recommender systems [34, 51, 52], which utilizes a user’s (tourist’s) rating on the items (POIs) to recommend a set of item for another user based on their user similarities. Based on two definitions of user ratings, we implemented two variations of CF-based baseline algorithms, namely

Collaborative filtering based on photographs uploaded $(CF-Pho)$ The user/tourist’s rating on each item/POI is based on the number of uploaded photographs of that particular POI he/she has uploaded, i.e., a higher number of uploaded photographs correspond to a higher rating for that POI.
Collaborative filtering based on POIs visited $(CF-Bin)$ The user/tourist’s rating on each item/POI is based on whether they have visited that particular POI, i.e., a binary rating of 1 (visited) or 0 (not visited).

As CF-based algorithms recommend the top-K individual POIs instead of an itinerary of connected POIs, we implemented additional processing steps to ensure a consistent output result for our tour recommendation problem. Based on a starting POI $p_1$ (like our PersTour algorithm), the CF-Pho and CF-Bin algorithms will iteratively add in the highest ranked POI from the top-K results, until either: (i) the budget B is exhausted or (ii) the destination POI $p_N$ is reached.

The second category of baseline algorithms is variations of greedy-based approaches that have also been used in other tour recommendation research [4, 6, 46]. Similar to our PersTour approach, these baseline algorithms commence from a starting POI $p_1$ and iteratively choose a next POI to visit until either: (i) the budget B is exhausted; or (ii) the destination POI $p_N$ is reached. The sequence of selected POIs thus forms the recommended itinerary, and the three greedy-based baselines are:

Greedy nearest (GNear) Chooses the next POI to visit by randomly selecting from the three nearest, unvisited POIs.
Greedy most popular (GPop) Chooses the next POI to visit by randomly selecting from the three most popular, unvisited POIs.
Random selection (Rand) Chooses the next POI to visit by randomly selecting from all unvisited POIs.

GNear and GPop are meant to reflect tourists’ behavior by visiting nearby and popular POIs, respectively, while Rand shows the performance of a random-based approach.

5.3 Evaluation

We evaluate PersTour and the baselines using leave-one-out cross-validation [18], i.e., when evaluating a specific travel sequence of a user, we use this user’s other travel sequences for training our algorithms. Specifically, we consider all real-life travel sequences with $\ge $3 POI visits and evaluate the algorithms using the starting POIs and destination POIs of these travel sequences. Thereafter, we evaluate the performance of each algorithm based on the recommended tour itinerary I using the following metrics:^{Footnote 5}

1.
Tour recall: $T_{R}(I)$ The proportion of POIs in a user’s real-life travel sequence that were also recommended in itinerary I. Let $P_r$ be the set of POIs recommended in itinerary I and $P_v$ be the set of POIs visited in the real-life travel sequence, tour recall is defined as: $T_{R}(I) = \frac{|P_r \cap P_v|}{|P_v|}$.
2.
Tour precision: $T_{P}(I)$ The proportion of POIs recommended in itinerary I that were also in a user’s real-life travel sequence. Let $P_r$ be the set of POIs recommended in itinerary I and $P_v$ be the set of POIs visited in the real-life travel sequence, tour precision is defined as: $T_{P}(I) = \frac{|P_r \cap P_v|}{|P_r|}$.
3.
Tour: $F_1$-score $T_{F_1}(I)$ The harmonic mean of both the recall and precision of a recommended tour itinerary I defined as: $T_{F_1}(I) = \frac{2 \times T_{P}(I) \times T_{R}(I)}{T_{P}(I)+T_{R}(I)}$.
4.
Root-mean-square error (RMSE) of POI visit duration: $T_{\textit{RMSE}}(I)$ The level of error between our recommended POI visit durations in itinerary I compared to the real-life POI visit durations taken by the users. Let $I^s \in I$ be the recommended POIs that were visited in real life,^{Footnote 6} and $D_r$ and $D_v$ be the recommended and real-life POI visit durations, respectively, RMSE is defined as: $T_{\textit{RMSE}}(I) = \sqrt{\frac{\sum _{p \in I^s} (D_r-D_v)^2}{|I^s|}}$.
5.
Tour popularity: $T_{\textit{Pop}}(I)$ The overall popularity of all POIs in the recommended itinerary I defined as: $T_{\textit{Pop}}(I) = \sum \limits _{p \in I} Pop(p)$.
6.
Tour interest: $T^u_{\textit{Int}}(I)$ The overall interest of all POIs in the recommended itinerary I to a user u defined as: $T^u_{\textit{Int}}(I) = \sum \limits _{p \in I} \textit{Int}_u(\textit{Cat}_p)$.
7.
Popularity and interest rank: $T^a_{\textit{Rk}}$ The average rank of an algorithm a based on its $T_{\textit{Pop}}$ and $T_{\textit{Int}}$ scores ranked against other algorithms (1 $=$ best, 12 $=$ worst).

We selected these metrics to better evaluate the following: (i) time-based versus frequency-based user interest, using Metrics 1–3; (ii) personalized versus non-personalized POI visit durations, using Metric 4; and (iii) PersTour versus baselines, using Metrics 5–7. As personalized POI visit durations only apply to PersTour and not the baselines, we only report $T_{\textit{RMSE}}$ scores for the PT-0, PT-.5F, PT-.5T, PT-.5U, PT-1F, PT-1T and PT-1U algorithms. Our baseline for comparing $T_{\textit{RMSE}}$ is variants of PersTour that use non-personalized POI visit durations, i.e., average POI visit durations.

6 Results and discussion

In this section, we discuss our experimental results and highlight our main findings.

6.1 Comparison between time-based and frequency-based user interest

Table 2 Comparison between time-based user interest (PT-.5T and PT-1T) and frequency-based user interest (PT-.5F and PT-1F), in terms of recall ($T_{R}$), precision ($T_{P}$) and F$_1$-score ($T_{F_1}$)

Full size table

Table 3 Comparison between time-based user interest (PT-.5T and PT-1T) and frequency-based user interest (PT-.5F and PT-1F), in terms of recall ($T_{R}$), precision ($T_{P}$) and F$_1$-score ($T_{F_1}$)

Full size table

Figure 2 presents an overview of results in terms of the average recall ($T_{R}$), precision ($T_{P}$) and F$_1$-score ($T_{F_1}$) across all ten cities, for the different variations of our PersTour algorithm and the baselines. The results show that all variants of PersTour out-perform the five baselines in terms of $T_{R}$ and $T_{F_1}$ scores. In terms of $T_{P}$ scores, the PersTour variants of PT-0, PT-.5T, PT-.5U and PT-.5F out-perform all baselines, while the CF-Pho and CF-Bin out-perform the PersTour variants of PT-1T, PT-1U and PT-1F. We now examine the performance of our PersTour algorithm and the baselines in greater detail.

Moving on to the results for individual cities, we study the performance difference between using time-based user interest and frequency-based user interest, as shown in Tables 2 and 3. Comparing the $T_{F_1}$ scores between PT-.5T, PT-.5U and PT-.5F, and between PT-1T, PT-1U and PT-1F, the results show that PersTour using time-based user interest (PT-.5T, PT-.5U, PT-1T and PT-1U) out-performs its counterpart that uses frequency-based user interest (PT-.5F and PT-1F), in most cases. This observation highlights the effectiveness of time-based user interest in recommending tours that more accurately reflect real-life tours of users, compared to using frequency-based user interest. While PT-1T and PT-1U under-perform PT-1F in terms of $T_{R}$ for some cities, we focus more on the $T_{F_1}$ scores as it provides a balanced representation of both $T_{R}$ and $T_{P}$. Moreover, for all cities, PT-.5T, PT-.5U, PT-1T and PT-1U (time-based user interest) result in higher $T_{P}$ scores, compared to its PT-.5F and PT-1F counterparts (frequency-based user interest). Another observation is that all PersTour variants also out-perform the five baselines for all cities, in terms of $T_{F_1}$ scores.

The reason for the more accurate recommendations of time-based user interest compared to frequency-based user interest is due to its use of POI visit durations instead of POI visit frequency. Figure 3 illustrates a toy example that highlights the difference between time-based user interest and frequency-based user interest. Consider user A who only visited two parks but spent three or more hours at each of them and user B who visited five parks but spent less than 15 minutes at each of them. Frequency-based interest incorrectly classifies user B as having more interest in the parks category due to his/her five visits, compared to user A’s two visits. On the other hand, time-based interest more accurately determines that user A has a higher interest in the parks category due to his/her long visit duration, despite user A only visiting two parks. Furthermore, time-based interest can more accurately capture a user’s level of interest based on how much longer this user spends at a POI compared to the average user (e.g., a user is more interested if he/she spends 3 h at a POI when the average time spent is only 30 minutes). With the availability of user interest levels, we can better personalize POI visit duration for each unique user, which we evaluate next.

6.2 Comparison between personalized and non-personalized visit durations

Table 4 Comparison between personalized and non-personalized visit durations, in terms of RMSE ($T_{\textit{RMSE}}$)

Full size table

Table 5 Comparison between personalized and non-personalized visit durations, in terms of RMSE ($T_{\textit{RMSE}}$)

Full size table

The $T_{\textit{RMSE}}$ scores in Tables 4 and 5 show that our recommendation of a personalized POI visit duration (Definitions 5 and 8) out-performs the non-personalized version in 62 out of 70 cases, based on a smaller error in the recommended POI visit durations. This result shows that personalizing POI visit duration using time-based user interests more accurately reflects the real-life POI visit duration of users, compared to the current standard of simply using average POI visit duration. Apart from recommending accurate POIs ($T_{F_1}$ scores), recommending the appropriate amount of time to spend at each POI is another important consideration in tour recommendation, which has not been explored in earlier works that also aim to recommend tour itineraries.

Previously, we have observed how time-based interest results in more accurate POI recommendations based on the $T_{F_1}$ scores. Our personalized POI visit duration builds upon this success by customizing the POI visit duration to each unique user based on his/her relative interest level, i.e., spend more time in a POI that interests the user and less time in a POI that the user is less interested in. Accurate POI visit durations have another important implication in tour recommendation, where spending less time at un-interesting POIs frees up the time budget for more visits to POIs that are more interesting to the user. Similarly, a user might prefer to spend more time visiting a few POIs of great interest, compared to visiting many POIs of less interest to the user.

6.3 Comparison of popularity and interest

We now present an overview of the results in terms of the average popularity ($T_{\textit{Pop}}$), interest ($T_{\textit{Int}}$) and rank ($T_{\textit{Rk}}$) score for all ten cities, as shown in Fig. 4. In particular, we are most interested in the $T_{\textit{Rk}}$ score that is derived from the average rank of an algorithm based on its $T_{\textit{Pop}}$ and $T_{\textit{Int}}$ scores, compared to other algorithms. For $T_{\textit{Rk}}$ scores, a value of 1 indicates the best performance, while 12 indicates the worst performance. Based on this $T_{\textit{Rk}}$ score, all variants of PersTour out-perform the five baselines, with PT-.5U being the best performer. We observe a similar performance in terms of $T_{\textit{Int}}$ scores, where all variants of PersTour out-performing the baselines. In terms of $T_{\textit{Pop}}$ scores, the PersTour variants of PT-0, PT-.5T, PT-.5U and PT-.5F out-perform all baselines, while PT-1T, PT-1U and PT-1F under-performs the GPop baseline. This performance is understandable as the PT-1T, PT-1U and PT-1F algorithms emphasize fully on user interest preferences, while the GPop baseline focuses on the most popular POIs thus maximizing the $T_{\textit{Pop}}$ scores for the latter. Next, we provide a more in-depth discussion of the performance among the various PersTour variants.

Based on the $T_{\textit{Rk}}$ scores in Tables 6 and 7, we observe that PT-.5U (time-based user interest with weighted updates) is overall the best performer, and PT-.5T (time-based user interest) is the second best performer.^{Footnote 7} In addition, we also observe that PT-1U (time-based user interest with weighted updates) out-performs its PT-1F counterpart (frequency-based user interest) for eight out of ten cities, with the same performance for the remaining two cities. These results show the benefits of applying weighted updates to user interests (PT-.5U and PT-1U), compared to simply using time-based user interest without weighted updates (PT-.5T and PT-1T).

Next, we examine how PersTour (with and without weighted updates) performs against the various baselines. Both PT-.5U and PT-.5T out-perform all baselines as well as its PT-.5F counterpart that uses frequency-based user interest. Similarly, PT-1T (time-based user interest) out-performs PT-1F (frequency-based user interest) for six out of ten cities, with the same performance for the remaining four cities. These results show the effectiveness of time-based user interest (both with and without weighted updates) over frequency-based user interest, based on the $T_{\textit{Rk}}$ scores.

The effects of the $\eta $ parameter can be observed in the $T_{\textit{Pop}}$ and $T_{\textit{Int}}$ scores. A value of $\eta =0$ (PT-0) results in the best performance in $T_{\textit{Pop}}$ and worst performance in $T_{\textit{Int}}$, while a value of $\eta =1$ (PT-1F, PT-1T and PT-1U) results in the opposite. While we include the $T_{\textit{Pop}}$ and $T_{\textit{Int}}$ scores for completeness, we are more interested in $T_{\textit{Rk}}$ as it gives a balanced measurement of both $T_{\textit{Pop}}$ and $T_{\textit{Int}}$.

Table 6 Comparison of PersTour (PT) against baselines, in terms of popularity ($T_{\textit{Pop}}$), interest ($T_{\textit{Int}}$) and rank ($T_{\textit{Rk}}$)

Full size table

Table 7 Comparison of PersTour (PT) against baselines, in terms of popularity ($T_{\textit{Pop}}$), interest ($T_{\textit{Int}}$) and rank ($T_{\textit{Rk}}$)

Full size table

6.4 Comparison of PersTour with adaptive weights

To evaluate the effectiveness of using adaptive weights, we compare PersTour without adaptive weights (PT-.5U and PT-1U) against PersTour with adaptive weights (PT-AS and PT-AC). We focus mainly on the top and bottom 15% of users of each city, based on their number of total POI visits. The reason for choosing these users is that adaptive weights are most beneficial to such outlier users as we can recommend more personalized tours to users with many POI visits and popular tours to users with little POI visits.

Our main evaluation metrics are the $T_{R}$, $T_{P}$ and $T_{F_1}$ scores as they indicate the effectiveness of adaptive weights in recommending tours that correspond to real-life visits. Table 8 shows that PT-AS has the overall best performance as indicated by the highest $T_{R}$, $T_{P}$ and $T_{F_1}$ scores for seven, five and six cities, respectively, out of all ten cities. These results show the effectiveness of implementing adaptive weights for different users, i.e., a different level of emphasis between POI popularity and user interest preferences for different users.

Table 8 Comparison between PersTour with weighted updates and PersTour with adaptive weightings, in terms of recall ($T_{R}$), precision ($T_{P}$) and F$_1$-score ($T_{F_1}$)

Full size table

7 Conclusion and future work

We modeled our tour recommendation problem based on the Orienteering problem and proposed the PersTour algorithm for recommending personalized tours. Our PersTour algorithm considers both POI popularity and user interest preferences to recommend suitable POIs to visit and the amount of time to spend at each POI. In addition, we implemented a framework where geo-tagged photographs can be used to automatically detect real-life travel sequences and determine POI popularity and user interest, which can then be used to train our PersTour algorithm. Our work improves upon earlier tour recommendation research in three main ways: (i) we introduce time-based user interest derived from a user’s visit durations at specific POIs relative to other users, instead of using a frequency-based user interest based on POI visit frequency; (ii) we personalize POI visit duration based on the relative interest levels of individual users, instead of using the average POI visit duration for all users or not considering POI visit duration at all; and (iii) we introduce two adaptive weighting methods to automatically determine the emphasis on POI popularity and user interest preferences.

Using a Flickr dataset across ten cities, we evaluate the effectiveness of our PersTour algorithm against various collaborative filtering and greedy-based baselines, in terms of tour popularity, interest, precision, recall, F$_1$-score and RMSE of visit duration. In particular, our experimental results show that: (i) using time-based user interest results in tours that more accurately reflect the real-life travel sequences of users, compared to using frequency-based user interest, based on precision and F$_1$-score; (ii) our personalized POI visit duration more accurately reflects the time users spend at POIs in real-life, compared to the current standard of using average visit duration, based on the RMSE of visit duration; (iii) PersTour and its variants out-perform all baselines in most cases, based on tour popularity, interest, precision, recall and F$_1$-score; and (iv) our adaptive weighting methods further improve the performance of PersTour, based on precision, recall and F$_1$-score.

In this work, we focused mainly on recommending tours that are personalized to individual users based on their time-based user interest. Some possible directions for future work are:

Modeling uncertainty in POI visit duration based on the day of the week and time of the day. The main consideration for this work is to incorporate some uncertainty in the amount of time recommended at various POIs due to delays caused by crowds (e.g., POIs are more crowded during weekends than weekdays, thus causing possible delays)
Recommending tour itineraries that utilize multiple types of transport (e.g., walking, bus, train, taxi, car), instead of a single type of transport. The main motivation of this future work would be to offer users the flexibility to switch between different modes of transport, while excluding certain types (e.g., either bus, train or taxi but no walking).
When using public transport (e.g., bus, train, tram), recommend tour itineraries that consider the arrival and departure times of public transport to minimize the waiting time by the tourists for their respective public transport to arrive. Furthermore, we can also model uncertainty in the arrival times, especially when there are connections between multiple transport modes.

Notes

We use the terms “tourist” and “user” interchangeably, and similarly for the terms “tour” and “trip.”
This publication is an extended version of Lim et al. [27] that appeared in IJCAI’15, with the additional contributions of Points 3, 5 and 7.
$T^{\textit{Travel}}(p_x, p_y)$ can be easily generalized to different transport modes (e.g., taxi, bus, train) and to also consider the traffic condition between POIs (e.g., longer travel times between two POIs in a congested city, compared to two equal-distanced POIs elsewhere).
Although we examine POIs in this work, our tour recommendation problem definition can be easily modified such that a recommended tour itinerary starts and ends at a specific hotel where the tourist is staying at.
Some metrics are rounded off to the same value, but are different values before rounding. The bold-faced values indicate the best performing metrics.
We can only compare POI visit durations for POIs in itinerary I that were “correctly” recommended (i.e., visited in real life).
PT-.5T out-performs PT-.5U in only one city, performs the same in five cities and under-performs in the remaining four cities.

References

Anagnostopoulos A, Atassi R, Becchetti L, Fazzone A, Silvestri F (2016) Tour recommendation for groups. Data Min Knowl Discov 1–32. doi:10.1007/s10618-016-0477-710
Baraglia R, Muntean CI, Nardini FM, Silvestri F (2013) Learnext: learning to predict tourists movements. In: Proceedings of the 22nd ACM international conference on information and knowledge management (CIKM’13), pp 751–756
Berkelaar M, Eikland K, Notebaert P (2004) lpsolve: open source (mixed-integer) linear programming system. http://lpsolve.sourceforge.net/
Brilhante I, Macedo JA, Nardini FM, Perego R, Renso C (2013) Where shall we go today? Planning touristic tours with TripBuilder. In: Proceedings of the 22nd ACM international conference on information and knowledge management (CIKM’13), pp 757–762
Brilhante I, Macedo JA, Nardini FM, Perego R, Renso C (2014) Tripbuilder: a tool for recommending sightseeing tours. In: Proceedings of the 36th European conference on information retrieval (ECIR’14), pp 771–774
Brilhante IR, Macedo JA, Nardini FM, Perego R, Renso C (2015) On planning sightseeing tours with tripbuilder. Inf Process Manag 51(2):1–15
Article Google Scholar
Castillo L, Armengol E, Onaindía E, Sebastiá L, González-Boticario J, Rodríguez A, Fernández S, Arias JD, Borrajo D (2008) SAMAP: an user-oriented adaptive system for planning tourist visits. Expert Syst Appl 34(2):1318–1332
Article Google Scholar
Chen C, Zhang D, Guo B, Ma X, Pan G, Wu Z (2015) TripPlanner: personalized trip planning leveraging heterogeneous crowdsourced digital footprints. IEEE Trans Intell Transp Syst 16(3):1259–1273
Article Google Scholar
Choudhury MD, Feldman M, Amer-Yahia S, Golbandi N, Lempel R, Yu C (2010) Automatic construction of travel itineraries using social breadcrumbs. In: Proceedings of the 21st ACM conference on hypertext and hypermedia (HT’10), pp 35–44
Cohen R, Katzir L (2008) The generalized maximum coverage problem. Inf Process Lett 108(1):15–22
Article MathSciNet MATH Google Scholar
Crandall DJ, Backstrom L, Cosley D, Suri S, Huttenlocher D, Kleinberg J (2010) Inferring social ties from geographic coincidences. Proc Natl Acad Sci 107(52):22436–22441
Article Google Scholar
Gavalas Damianos, Charalampos Konstantopoulos KMGP (2014) A survey on algorithmic approaches for solving tourist trip design problems. J Heuristics 20(3):291–328
Article Google Scholar
Gionis A, Lappas T, Pelechrinis K, Terzi E (2014) Customized tour recommendations in urban areas. In: Proceedings of the 7th ACM international conference on web search and data mining (WSDM’14), pp 313–322
Gunawan A, Lau HC, Vansteenwegen P (2016) Orienteering problem: a survey of recent variants, solution approaches and applications. Eur J Oper Res 255(2):315–332
Article MathSciNet MATH Google Scholar
Ji R, Xie X, Yao H, Ma W-Y (2009) Mining city landmarks from blogs by graph modeling. In: Proceedings of the 17th ACM international conference on multimedia (MM’09), pp 105–114
Kisilevich S, Mansmann F, Keim D (2010) P-dbscan: a density based clustering algorithm for exploration and analysis of attractive areas using collections of geo-tagged photos. In: Proceedings of the 1st international conference and exhibition on computing for geospatial research and application (COMGeo’10), p 38
Kofler C, Caballero L, Menendez M, Occhialini V, Larson M (2011) Near2me: an authentic and personalized social media-based recommender for travel destinations. In: Proceedings of the 3rd ACM SIGMM international workshop on social media (WSM’11), pp 47–52
Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of the 14th international joint conference on artificial intelligence (IJCAI’95), pp 1137–1145
Kurashima T, Iwata T, Irie G, Fujimura K (2010) Travel route recommendation using geotags in photo sharing sites. In: Proceedings of the 19th ACM international conference on information and knowledge management (CIKM’10), pp 579–588
Kurashima T, Iwata T, Irie G, Fujimura K (2013) Travel route recommendation using geotagged photos. Knowl Inf Syst 37(1):37–60
Article Google Scholar
Leung KW-T, Lee DL, Lee W-C (2011) CLR: a collaborative location recommendation framework based on co-clustering. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval (SIGIR’11), pp 305–314
Li J, Qian X, Tang YY, Yang L, Mei T (2013) GPS estimation for places of interest from social users’ uploaded photos. IEEE Trans Multimed 15(8):2058–2071
Article Google Scholar
Li W, Eickhoff C, de Vries AP (2014) Geo-spatial domain expertise in microblogs. In: Proceedings of the 36th European conference on information retrieval (ECIR’14), pp 487–492
Lim KH (2015) Recommending tours and places-of-interest based on user interests from geo-tagged photos. In: Proceedings of the 2015 SIGMOD PhD symposium (SIGMOD’15), pp 33–38
Lim KH, Chan J, Karunasekera S, Leckie C (2017) Personalized itinerary recommendation with queuing time awareness. In: Proceedings of the 40th international ACM SIGIR conference on research and development in information retrieval (SIGIR’17)
Lim KH, Chan J, Leckie C, Karunasekera S (2015a) Improving location prediction using a social historical model with strict recency context. In: Proceedings of the 5th workshop on context-awareness in retrieval and recommendation (CaRR’15)
Lim KH, Chan J, Leckie C, Karunasekera S (2015b) Personalized tour recommendation based on user interests and points of interest visit durations. In: Proceedings of the twenty-fourth international joint conference on artificial intelligence (IJCAI’15), pp 1778–1784
Lim KH, Chan J, Leckie C, Karunasekera S (2016) Towards next generation touring: personalized group tours. In: Proceedings of the 26th international conference on automated planning and scheduling (ICAPS’16), pp 412–420
Lim KH, Wang X, Chan J, Karunasekera S, Leckie C, Chen Y, Tan CL, Gao FQ, Wee TK (2016) PersTour: a personalized tour recommendation and planning system. In: Extended proceedings of the 27th ACM conference on hypertext and social media (HT’16)
Lucchese C, Perego R, Silvestri F, Vahabi H, Venturini R (2012) How random walks can help tourism. In: Proceedings of the 34th European conference on information retrieval (ECIR’12), pp 195–206
Miller CE, Tucker AW, Zemlin RA (1960) Integer programming formulation of traveling salesman problems. J ACM 7(4):326–329
Article MathSciNet MATH Google Scholar
Popescu A, Grefenstette G, Moëllic P-A (2009) Mining tourist information from user-supplied collections. In: Proceedings of the 18th ACM conference on information and knowledge management (CIKM’09), pp 1713–1716
Quercia D, Schifanella R, Aiello LM (2014) The shortest path to happiness: recommending beautiful, quiet, and happy routes in the city. In: Proceedings of the 25th ACM conference on hypertext and social media (HT’14), pp 116–125
Resnick P, Iacovou N, Suchak M, Bergstrom P, Riedl J (1994) Grouplens: an open architecture for collaborative filtering of netnews. In: Proceedings of the 1994 ACM conference on computer supported cooperative work (CSCW’94), pp 175–186
Schedl M, Hauger D, Schnitzer D (2012) A model for serendipitous music retrieval, In: Proceedings of the 2nd workshop on context-awareness in retrieval and recommendation (CaRR’12), pp 10–13
Shi Y, Serdyukov P, Hanjalic A, Larson M (2011) Personalized landmark recommendation based on geotags from photo sharing sites. In: Proceedings of the fifth international AAAI conference on weblogs and social media (ICWSM’11), pp 622–625
Sinnott RW (1984) Virtues of the Haversine. Sky Telesc 68(158):159
Souffriau W, Vansteenwegen P (2010) Tourist trip planning functionalities: state-of-the-art and future. In: Proceedings of the 10th international conference on web engineering (ICWE’10), pp 474–485
Spyrou E, Mylonas P (2016) A survey on Flickr multimedia research challenges. Eng Appl Artif Intell 51:71–91
Article Google Scholar
Sun Y, Fan H, Bakillah M, Zipf A (2015) Road-based travel recommendation using geo-tagged images. Comput Environ Urban Syst 53:110–122
Article Google Scholar
Thomee B, Shamma DA, Friedland G, Elizalde B, Ni K, Poland D, Borth D, Li L-J (2016) YFCC100M: the new data in multimedia research. Commun ACM 59(2):64–73
Article Google Scholar
Tsiligirides T (1984) Heuristic methods applied to orienteering. J Oper Res Soc 35(9):797–809
Article Google Scholar
Vansteenwegen P, Oudheusden DV (2007) The mobile tourist guide: an OR opportunity. OR Insight 20(3):21–27
Article Google Scholar
Vansteenwegen P, Souffriau W, Berghe GV, Oudheusden DV (2011) The city trip planner: an expert system for tourists. Expert Syst Appl 38(6):6540–6546
Article Google Scholar
Vansteenwegen P, Souffriau W, Oudheusden DV (2011) The orienteering problem: a survey. Eur J Oper Res 209(1):1–10
Article MathSciNet MATH Google Scholar
Wang X, Leckie C, Chan J, Lim KH, Vaithianathan T (2016) Improving personalized trip recommendation by avoiding crowds. In: Proceedings of the 25th ACM international conference on information and knowledge management (CIKM’16), pp 25–34
Wörndl W, Hefele A (2016) Generating paths through discovered places-of-interests for city trip planning. In: Information and communication technologies in tourism. Springer International Publishing, pp 441–453
Yahoo! Webscope (2014) Yahoo! Flickr creative Commons 100M dataset (YFCC-100M). http://webscope.sandbox.yahoo.com/catalog.php?datatype=i&did=67
Yamasaki T, Gallagher A, Chen T (2013) Personalized intra-and inter-city travel recommendation using large-scale geotags. In: Proceedings of the 2nd ACM international workshop on geotagging and its applications in multimedia (GeoMM’13), pp 25–30
Yao L, Sheng QZ, Qin Y, Wang X, Shemshadi A, He Q (2015) Context-aware point-of-interest recommendation using tensor factorization with social regularization. In: Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval (SIGIR’15), pp 1007–1010
Ye M, Yin P, Lee W-C, Lee D-L (2011) Exploiting geographical influence for collaborative point-of-interest recommendation. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval (SIGIR’11), pp 325–334
Yuan Q, Cong G, Ma Z, Sun A, Thalmann NM (2013) Time-aware point-of-interest recommendation. In: Proceedings of the 36th international ACM SIGIR conference on research and development in information retrieval (SIGIR’13), pp 363–372

Download references

Acknowledgements

This work was supported in part by Data61. We thank the anonymous reviewers for their useful comments and suggestions.

Author information

Authors and Affiliations

Department of Computing and Information Systems, University of Melbourne, Parkville, Australia
Kwan Hui Lim, Jeffrey Chan, Christopher Leckie & Shanika Karunasekera
Data61, CSIRO, Canberra, Australia
Kwan Hui Lim
School of Science, RMIT University, Melbourne, Australia
Jeffrey Chan

Authors

Kwan Hui Lim
View author publications
You can also search for this author in PubMed Google Scholar
Jeffrey Chan
View author publications
You can also search for this author in PubMed Google Scholar
Christopher Leckie
View author publications
You can also search for this author in PubMed Google Scholar
Shanika Karunasekera
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Kwan Hui Lim.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Lim, K.H., Chan, J., Leckie, C. et al. Personalized trip recommendation for tourists based on user interests, points of interest visit durations and visit recency. Knowl Inf Syst 54, 375–406 (2018). https://doi.org/10.1007/s10115-017-1056-y

Download citation

Received: 15 December 2015
Revised: 24 March 2017
Accepted: 19 April 2017
Published: 05 May 2017
Issue Date: February 2018
DOI: https://doi.org/10.1007/s10115-017-1056-y

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Personalized trip recommendation for tourists based on user interests, points of interest visit durations and visit recency

Abstract

Similar content being viewed by others

PoI Recommendation System: A Blended Approach

Top-k Context-Aware Tour Recommendations for Groups

A personalized POI route recommendation system based on heterogeneous tourism data and sequential pattern mining

Explore related subjects

1 Introduction

2 Related work

2.1 Background on the Orienteering problem

2.2 Tour recommendation based on Orienteering problem and its variants

2.3 Tour recommendation based on other combinatorial optimization problems

2.4 Top-k POI recommendation and next-location prediction

2.5 Other tourism-related work

2.6 Discussion of differences with previous work

3 Background and Problem definition

3.1 Preliminaries

Definition 1

Definition 2

Definition 3

Definition 4

Definition 5

Definition 6

Definition 7

Definition 8

3.2 Problem definition

3.2.1 Adaptive weighting

4 Tour recommendation framework

5 Experimental methodology

5.1 Dataset

5.2 Baseline algorithms

5.3 Evaluation

6 Results and discussion

6.1 Comparison between time-based and frequency-based user interest

6.2 Comparison between personalized and non-personalized visit durations

6.3 Comparison of popularity and interest

6.4 Comparison of PersTour with adaptive weights

7 Conclusion and future work

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation