1 Introduction

Tour recommendation and itinerary planning are challenging tasks due to the different interest preferences and trip constraints (e.g., time limits, start and end points) of each unique tourist.Footnote 1 While there is an abundance of information from the Internet and travel guides, many of these resources simply recommend individual points of interest (POI) that are deemed to be popular, but otherwise do not appeal to the interest preferences of users or adhere to their trip constraints. Furthermore, the massive volume of information makes it a challenge for tourists to narrow down to a potential set of POIs to visit in an unfamiliar city. Even after the tourist finds a suitable set of POIs to visit, it will take considerable time and effort for the tourist to plan the appropriate duration of visit at each POI and the order in which to visit the POIs.

To address these issues, we propose the PersTour algorithm for recommending personalized tours where the suggested POIs are optimized to the users’ interest preferences and POI popularity. We formulate our tour recommendation problem based on the Orienteering problem [42], which considers a user’s trip constraints such as time limitations and the need for the tour to start and end at specific POIs (e.g., POIs near the tourist’s hotel). Using geo-tagged photographs as a proxy for tourist visits, we are able to extract real-life user travel histories, which can then be used to automatically determine a user’s interest level in various POI categories (e.g., parks, beaches, shopping) as well as the popularity of individual POIs. As tourists have different preference levels between POI popularity and POI relevance to their interests, our PersTour algorithm also allows tourists to indicate their preferred level of trade-off between POI popularity and his/her interest preferences. In cases where the tourist prefers to automate the indication of this trade-off between POI popularity and interest preference, PersTour is also able to determine the appropriate trade-off based on the activity level of the tourist relative to the POI visits of the general population.

Fig. 1
figure 1

Tour recommendation framework

Our main contributionsFootnote 2 are as follows:

  1. 1.

    We propose the PersTour algorithm for recommending personalized tour/trip itineraries with POIs and visit duration based on POI popularity, users’ interest preferences and trip constraints. Our tour recommendation problem is modeled in the context of the Orienteering problem (Sect. 3).

  2. 2.

    We introduce the concept of time-based user interest for tour recommendation, where a user’s level of interest in a POI category is based on his/her time spent at such POIs, relative to the average user. We also compare our time-based user interest to the current practice of using frequency-based user interest and show how time-based user interest results in recommended tours that more accurately reflect real-life travel sequences (Sect. 3.1).

  3. 3.

    We also further enhance time-based user interest by implementing an update rule such that user interests are refined based on the recency of their past POI visits. This updating works by giving more emphasis to recent POI visits than those in the more distant past (Sect. 3.1).

  4. 4.

    We demonstrate the personalization of POI visit duration using time-based user interest, for the purpose of tour/trip itinerary recommendation. Our results show that personalized visit durations more accurately reflect the real-life POI visit durations of users, compared to the current practice of using average visit duration (Sect. 3.1).

  5. 5.

    While the PersTour algorithm gives tourists the flexibility to indicate their preferred weightage between POI popularity and his/her interests, we also propose two schemes to automatically determine an appropriate weightage based on the tourist’s activity level, relative to the general tourist population (Sect. 3.2.1).

  6. 6.

    We implement a framework (Fig. 1) for extracting real-life user travel histories based on their geo-tagged photographs, which are then used for training our PersTour algorithm and serve as ground truth for our subsequent evaluation (Sect. 4).

  7. 7.

    We evaluate different variants of PersTour against various baselines using a Flickr dataset spanning ten cities. Our results show that PersTour out-performs these baselines based on tour popularity, user interest, recall, precision and F\(_1\)-score (Sects. 5 and 6).

The rest of the paper is structured as follows: Sect. 2 discusses some related work in tour recommendation; Sect. 3 introduces some preliminaries and defines our research problem; Sect. 4 describes our overall framework for tour recommendation; Sect. 5 outlines our experimental methodology; Sect. 6 discusses our main results and key findings; and Sect. 7 summarizes and concludes our paper.

2 Related work

Tour recommendation has been a well-studied field, with many developed applications [5, 7, 29, 43, 47] and research ranging from recommending beautiful, quiet and happy tours [33] to tour recommendation using random walks with restart [30]. In our review of related work, we focus on research related to our work and refer readers to [12, 38] for an overview on the general field of tour recommendation. In the following sections, we provide an overview of the Orienteering problem before highlighting some key works in tour itinerary recommendations.

2.1 Background on the Orienteering problem

The Orienteering problem [42] originated from a competition of the same name. In this Orienteering competition, there are multiple navigational checkpoints distributed throughout an area, where each checkpoint is associated with a certain score. The main objective of participants in this competition is to maximize their total score, which is accumulated from visiting the various checkpoints. Participants are only given a limited amount of time to maximize their scores, and the winner is the participant who has accumulated the highest score. Due to this limitation of time, participants have to strategize and select a smaller subset of checkpoints to visit and decide on the sequence to visit these checkpoints. For a more in-depth review of the Orienteering problem, we refer readers to [14, 45]. In recent years, various works have used the Orienteering problem to model different variations of the tour recommendation problem, and we discuss some of these works.

2.2 Tour recommendation based on Orienteering problem and its variants

Many tour itinerary recommendation works are based on the Orienteering problem and its variants. For example, Choudhury et al. [9] was one of the earlier tour recommendation studies based on the Orienteering problem, where recommended tours start and end at specific POIs while trying to maximize an objective score. Using a modified Orienteering problem, Gionis et al. [13] utilized POI categories such that recommended tours are constrained by a POI category visit order (e.g., museum \(\rightarrow \) park \(\rightarrow \) beach). Similarly, Lim [24] used a modified Orienteering problem constrained by a mandatory POI category, which corresponds to the POI category a user is most interested in. Based on user-indicated interests and trip constraints (e.g., time budget, start and end locations), Vansteenwegen et al. [44] recommended tours comprising POI categories that best match user interests while adhering to these trip constraints. In the context of theme parks, Lim et al. [25] recommended personalized itineraries with minimal queuing times at attractions, while maximizing user interests and attraction popularity. Others like [1, 28] have extended the Orienteering problem for the purpose of recommending tour itineraries for groups of tourists, with the aim of satisfying the diverse interest preferences of multiple tourists in a group.

2.3 Tour recommendation based on other combinatorial optimization problems

In contrast to works based on the Orienteering problem, there are also various tour itinerary recommendation works based on other combinatorial optimization and similar problems. For example, Brilhante et al. [4] formulated tour recommendation as a Generalized Maximum Coverage problem [10], with the objective of finding an optimal set of POIs based on both POI popularity and user interest. Thereafter, Brilhante et al. [6] extended upon the former by using a variation of the Traveling Salesman Problem, with the main aim of finding the shortest route among the set of optimal POIs recommended in [4]. In addition to user interests in tour recommendation, Chen et al. [8] also considered traveling times based on different traffic conditions, using trajectory patterns derived from taxi GPS traces. Focusing on traveling paths based on road segments between POIs, Sun et al. [40] recommended tour itineraries comprising popular POIs and interesting routes between these POIs, with POI and route popularity based on geo-tagged photographs. With further considerations for different transport modes, Kurashima et al. [19, 20] used a combined topic and Markov model to recommend tours based on both user interests and frequently traveled routes.

2.4 Top-k POI recommendation and next-location prediction

The recommendation of top-k POIs and next-location predictions is also closely related to our problem of tour itinerary recommendation. For example, LearNext [2] used Gradient Boosted Regression Trees and Ranking SVMs to predict the (single) next POI that a tourist will visit, while [49] performed a similar next-location prediction task using Markov models, along with seasonal and temporal information. Others like [36] used a category-regularized matrix factorization approach for recommending individual POIs, and Kofler et al. [17] proposed a system prototype for recommending individual POIs that are niche and specialized in nature. For top-k POI recommendations, many works utilized variants of matrix factorization or collaborative filtering approaches to recommend a ranked list of k POIs, using information such as friendship links [50], types of activities/users [21] and temporal patterns in POI visits [52].

2.5 Other tourism-related work

There are also many interesting tourism-related studies that utilize geo-tagged photographs for purposes ranging from identifying popular POIs to analyzing tourist behavior. For example, Ji et al. [15] implemented a graph modeling framework to identify popular POIs based on photographs posted in blogs, while [32] used geo-tagged photographs to understand tourist behavior based on their POI visit patterns and time spent. More generally, geo-tagged photographs have been used for other purposes such as predicting friendship relationships based on spatiotemporal links [11], identifying local clusters of interesting events and places [16] and estimating the location where a photograph is taken [22]. For a more comprehensive discussion of research that utilizes geo-tagged photographs, we direct readers to [39], who presented a comprehensive review of current applications and identified various interesting future directions.

2.6 Discussion of differences with previous work

While these previous works are the state of the art in tourism-related research, our proposed work differs from these earlier works in various aspects. First, we automatically derive a relative measure of time-based user interest using a user’s visit durations at POIs of a specific category, relative to the average visit durations of other users, whereas earlier tour recommendation works either use frequency-based user interest (based on POI visit frequency) or require users to explicitly indicate their interest preferences for tour itinerary recommendation. Second, we plan and recommend tour itineraries with personalized POI visit durations that cater to individual users based on their time-based user interests, whereas previous works recommend tour itineraries using the same non-personalized POI visit duration for all users (either the average duration or a fixed duration, e.g., 1 h at all POIs) or do not consider POI visit duration at all. Third, although the works on top-k POI recommendation and next-location prediction are related to our tour itinerary recommendation problem, our proposed problem involves the additional considerations of user interest preferences, POI popularity, time constraints, starting/ending locations and more importantly, recommending a connected tour itinerary that satisfies these considerations, instead of individual POIs. While the other tourism-related works illustrate many interesting applications of geo-tagged photographs, these works use such photographs to study tourist behavior and identify popular POIs, which are distinctly different from the task of recommending a personalized tour itinerary.

3 Background and Problem definition

In this section, we first examine some preliminary definitions, before introducing a formulation of our tour recommendation problem.

3.1 Preliminaries

If there are m POIs for a particular city, let \(P = \{p_1, \ldots ,p_m\}\) be the set of POIs in that city. Each POI p is also labeled with a category \(\textit{Cat}_p\) (e.g., church, park, beach) and latitude/longitude coordinates. We denote a function \(\textit{Pop}(p)\) that indicates the popularity of a POI p, based on the number of times POI p has been visited. Similarly, the function \(T^{\textit{Travel}}(p_x, p_y)\) measures the time needed to travel from POI \(p_x\) to \(p_y\), based on the distance between POIs \(p_x\) and \(p_y\) and the indicated traveling speed. For simplicity, we use a traveling speed of 4 km/h, i.e., a leisure walking speed.Footnote 3

Definition 1

Travel history Given a user u who has visited n POIs, we define his/her travel history as an ordered sequence, \(S_u = ((p_1, t^a_{p_1}, t^d_{p_1}), \ldots ,(p_n, t^a_{p_n}, t^d_{p_n}))\), with each triplet \((p_x, t^a_{p_x}, t^d_{p_x})\) comprising the visited POI \(p_x\), and the arrival time \(t^a_{p_x}\) and departure time \(t^d_{p_x}\) at POI \(p_x\). Thus, the visit duration at POI \(p_x\) can be determined by the difference between \(t^a_{p_x}\) and \(t^d_{p_x}\). Similarly, for a travel sequence \(S_u\), \(t^a_{p_1}\) and \(t^d_{p_n}\) also indicate the start and end time of the itinerary, respectively. For brevity, we simplify \(S_u = ((p_1, t^a_{p_1}, t^d_{p_1}), \ldots ,(p_n, t^a_{p_n}, t^d_{p_n}))\) as \(S_u = (p_1, \ldots ,p_n)\).

Definition 2

Travel sequence Based on the travel history \(S_u\) of a user u, we can further divide this travel history into multiple travel sequences, i.e., sub-sequences of \(S_u\). We divide a travel history \(S_u\) into separate travel sequences if \(t^d_{p_x} - t^a_{p_x+1} > \tau \). That is, we separate a travel history into distinct travel sequences if the consecutive POI visits occur more than \(\tau \) time units apart. Similar to other works [9, 24], we choose \(\tau =8~h\) in our experiments. These travel sequences also serve as the ground truth of real-life user trajectories, which are subsequently used for evaluating our PersTour algorithm and baselines. For a user u with n travel sequences, we use \(S^1_u, S^2_u, \ldots ,S^n_u\) to denote the different travel sequences in temporal order, such that \(S^1_u\) took place before \(S^2_u\).

Definition 3

Average POI visit duration Given a set of travel histories for all users U, we determine the average visit duration for a POI p as follows:

$$\begin{aligned} {\bar{V}}(p) = \frac{1}{n} {\sum \limits _{u \in U}} {\sum \limits _{p_x \in S_u} \left( t^d_{p_x} - t^a_{p_x}\right) } \delta (p_x=p), \quad \forall ~p \in P \end{aligned}$$
(1)

where n is the number of visits to POI p by all users and \(\delta (p_x=p)=\{^{1,~ p_x=p}_{0,~ \hbox {otherwise}}\). \({\bar{V}}(p)\) is commonly used in tour recommendation as the POI visit duration for all users [4, 6, 8], while many earlier works do not factor in POI visit durations at all. In our work, we show how recommended POI visit durations can be personalized to individual users based on their interest (Definition 5), and use \({\bar{V}}(p)\) as a comparison baseline, i.e., the non-personalized POI visit duration.

Definition 4

Time-based user interest As described earlier, the category of a POI p is denoted \(\textit{Cat}_p\). Given that C represents the set of all POI categories, we determine the interest of a user u in POI category c as follows:

$$\begin{aligned} \textit{Int}^{\textit{Time}}_u(c) = \sum \limits _{p_x \in S_u} \frac{\left( t^d_{p_x} - t^a_{p_x}\right) }{{\bar{V}}(p_x)} \delta (\textit{Cat}_{p_x}=c), \quad \forall ~c \in C \end{aligned}$$
(2)

where \(\delta (\textit{Cat}_{p_x}=c)=\{^{1,~ \textit{Cat}_{p_x}=c}_{0,~ \hbox {otherwise}}\). In short, Eq. 2 determines the interest of a user u in a particular POI category c, based on his/her time spent at each POI of category c, relative to the average visit duration (of all users) at the same POI. The rationale is that a user is likely to spend more time at a POI that he/she is interested in. Thus, by calculating how much more (or less) time a user is spending at POIs of a certain category compared to the average user, we can determine the interest level of this user in POIs of this category.

Definition 5

Personalized POI visit duration Based on our definition of time-based user interest (Eq. 2), we are able to personalize the recommended visit duration at each POI based on each user’s interest level. We determine the personalized visit duration at a POI p for a user u as follows:

$$\begin{aligned} T^{\textit{Visit}}_u(p) = {\textit{Int}^{\textit{Time}}_u(\textit{Cat}_p)} \times {{\bar{V}}(p)} \end{aligned}$$
(3)

That is, we are recommending a personalized POI visit duration based on user u’s relative interest level in category \(\textit{Cat}_p\) multiplied by the average time spent at POI p. Thus, if a user is more (less) interested in category \(\textit{Cat}_p\), he/she will spend more (less) time at POI p than the average user.

Definition 6

Frequency-based user interest We also define a simplified version of user interest, denoted \(\textit{Int}^{\textit{Freq}}_u(c)\), which is based on the number of times a user visits POIs of a certain category c (i.e., the more times a user visits POIs of a specific category, the more interested this user is in that category). As using \(\textit{Int}^{\textit{Freq}}_u(c)\) is the current practice in tour recommendation research [4, 6, 24], we include it for a more complete study and as a comparison baseline to our proposed \(\textit{Int}^{\textit{Time}}_u(c)\).

Definition 7

Time-based user interest with weighted updates We improve upon the original time-based user interest (Definition 4) by giving more emphasis to recent POI visits and less emphasis to POI visits in the distant past. Algorithm 1 details our proposed algorithm. In Line 9 of Algorithm 1, we continuously update user u’s interest by minimizing the error between his/her recommended and actual POI visit duration, while \(\frac{i}{n}\) ensures that more emphasis is given to more recent POI visits. Lines 6 to 8 calculate the error between the recommended and actual POI visit duration, while Lines 4 and 5 ensure that we perform this update for all POIs in all travel sequences of user u.

figure e

The intuition behind Algorithm 1 is that more recent POI visits are more relevant to a user and thus should contribute more to the modeling of this user’s interest. Similarly, other researchers have also observed people’s preference for more recent activities/information and utilized this recency preference for next check-in location prediction [26], location-based domain expert identification [23] and personalized music recommendation [35].

Definition 8

Personalized POI visit duration with weighted updates Similar to Definition 5, we can then recommend a personalized POI visit duration to POI p for a user u based on his/her time-based user interest with weighted updates, as follows:

$$\begin{aligned} T^{\textit{VisitUpd}}_u(p) = {\textit{Int}^{\textit{Upd}}_u(\textit{Cat}_p)} \times {{\bar{V}}(p)} \end{aligned}$$
(4)

Similar to Definition 5, we are personalizing the POI visit duration for user u based on his/her updated interest level in category \(\textit{Cat}_p\) multiplied by the average time that users spend at POI p.

3.2 Problem definition

We now define our tour recommendation problem in the context of the Orienteering problem and its integer problem formulation [24, 42, 45]. Given the set of POIs P, a budget B, starting POI \(p_1\) and destination POI \(p_N\), our goal is to recommend an itinerary \(I=(p_1, \ldots ,p_N)\) that maximizes a certain score S while adhering to the budget B.Footnote 4 In this case, the score S is represented by the popularity and user interest of the recommended POIs using the functions \(\textit{Pop}(p)\) and \(Int(\textit{Cat}_p)\), respectively. The budget B is based on time spent and calculated using the function \(\textit{Cost}(p_x, p_y) = T^{\textit{Travel}}(p_x, p_y) + T^{\textit{Visit}}_u(p_y)\), i.e., using both traveling time and personalized visit duration at the POI. One main difference between our work and earlier work is that we personalize the visit duration at each recommended POI based on user interest (Definition 5), instead of using the average visit duration for all users or not considering visit duration at all. Formally, we want to find an itinerary \(I = (p_1, \ldots ,p_N)\) that:

$$\begin{aligned} \textit{Max} \sum \limits _{i=2}^{N-1} \sum \limits _{j=2}^N x_{i,j} \Big ( \eta \textit{Int}(\textit{Cat}_i) + (1 - \eta ) \textit{Pop}(i) \Big ) \end{aligned}$$
(5)

where \(x_{i,j}=1\) if both POI i and j are visited in sequence (i.e., we travel directly from POI i to j), and \(x_{i,j}=0\) otherwise. We attempt to solve for Eq. 5, such that:

$$\begin{aligned}&\sum \limits _{j=2}^N x_{1,j} = \sum \limits _{i=1}^{N-1} x_{i,N} = 1 \end{aligned}$$
(6)
$$\begin{aligned}&\sum \limits _{i=1}^{N-1} x_{i,k} = \sum \limits _{j=2}^N x_{k,j} \le 1, \quad \forall ~k=2, \ldots ,N-1 \end{aligned}$$
(7)
$$\begin{aligned}&\sum \limits _{i=1}^{N-1} \sum \limits _{j=2}^N \textit{Cost}(i,j) x_{i,j} \le B \end{aligned}$$
(8)
$$\begin{aligned}&2 \le p_i \le N, \quad \forall ~i=2, \ldots ,N \end{aligned}$$
(9)
$$\begin{aligned}&p_i - p_j + 1 \le (N-1)(1-x_{i,j}), \quad \forall ~i,j=2, \ldots ,N \end{aligned}$$
(10)

Equation 5 is a multi-objective function that maximizes the popularity and interest of all visited POIs in the itinerary, where \(\eta \) is the weighting given to the popularity and interest components. Equation 5 is also subject to constraints 610. Constraint 6 ensures that the itinerary starts at POI 1 and ends at POI N, while constraint 7 ensures that the itinerary is connected and no POIs are visited more than once. Constraint 8 ensures that the time taken for the itinerary is within the budget B, based on the function \(\textit{Cost}(p_x, p_y)\) that considers both traveling time and personalized POI visit duration. Given that \(p_x\) is the position of POI x in itinerary I, constraints 9 and 10 ensure that there are no sub-tours in the proposed solution, adapted from the sub-tour elimination used in the Traveling Salesman Problem [31].

Based on this problem definition, we can then proceed to solve our tour recommendation problem as an integer programming problem. For solving this integer programming problem, we used the lpsolve linear programming package [3]. We denote our proposed algorithm for personalized tour recommendation as PersTour and shall describe our overall framework and the different PersTour variants in the following section.

3.2.1 Adaptive weighting

As introduced in Eq. 5, the \(\eta \) parameter offers tourists the flexibility to indicate their preferences for POI popularity and interest alignment. In this section, we propose two methods that automatically determine an appropriate value for the \(\eta \) parameter based on the POI visits by the general user population.

Given all users U and their set of travel histories \(S_{u \in U}\), we define the number of POI visit count for a user u as: \(C_u = |S_u|\). Similarly, \(C_{\textit{max}}\) denotes the maximum POI visit count out of all users \(u \in U\). We determine the \(\eta \) value (i.e., adaptive weighting) for a user u using the following two methods.

  • Adaptive weights based on scaling (PT-AS) This method determines the \(\eta \) value for a user u as follows: \(\eta = \frac{C_u}{C_{\textit{max}}}\). In short, we are scaling the POI visit count of a user u by the maximum POI visit count of all users.

  • Adaptive weights based on cumulative distribution (PT-AC) This method determines the \(\eta \) value for a user u as follows: \(\eta = P(C \le C_u)\). That is, we are building a probability distribution function based on all users’ POI visit counts, and then calculating the probability that a random variable C (i.e., the POI visit count) is less than or equal to the POI visit count of user u.

4 Tour recommendation framework

Figure 1 outlines our overall tour recommendation framework. This framework requires a list of POIs (with lat/long coordinates and POI categories) and a set of geo-tagged photographs (with lat/long coordinates and time taken), which can be easily obtained from Wikipedia and Flickr, respectively. Thereafter, the main steps in our framework are:

Step 1: Determine POI visits (map photographs to POIs) We first determine the POI visits in each city by mapping the set of geo-tagged photographs to the list of POIs. In particular, we map a photograph to a POI if their coordinates differ by <200m based on the Haversine formula [37], which is used for calculating spherical (earth) distances. If a photograph is within 200m of multiple POIs, we only map this photograph to the nearest POI, i.e., no photograph is mapped to multiple POIs.

Step 2: Construct travel history/sequences Based on the POI visits from Step 1, we can construct the travel history of each user by sorting their POI visits in ascending temporal order (Definition 1). Using each user’s travel history, we then proceed to group consecutive POI visits as an individual travel sequence, if the consecutive POI visits differ by <8 h (Definition 2). Thus, we are also able to determine the POI visit duration based on the time difference of the first and last photograph taken at each POI.

Step 3: Recommend tours using PersTour. As described in Sect. 3.2, there can be different variants of PersTour, based on the value of \(\eta \) and the type of interest function chosen. The value of \(\eta \) indicates the weight given to either POI popularity or user interest, while the interest function can be either frequency-based interest (\(\textit{Int}^{\textit{Freq}}_u\)), time-based interest (\(\textit{Int}^{\textit{Time}}_u\)) or time-based interest with weighted updates (\(\textit{Int}^{\textit{Upd}}_u\)). We experiment with the following variants:

  • PersTour using \(\eta =0\,(PT-0)\) PersTour with full emphasis on optimizing POI popularity, ignoring user interest (i.e., no need to choose between \({\textit{Int}}^{\textit{Time}}_u\) or \({\textit{Int}}^{\textit{Freq}}_u\)).

  • PersTour using \(Int^Freq_u\) and \(\eta =0.5\) \((PT-.5F)\) PersTour with balanced emphasis on optimizing both POI popularity and frequency-based user interest.

  • PersTour using \(Int^Time_u\) and \(\eta =0.5\) \((PT-.5T)\) PersTour with balanced emphasis on optimizing both POI popularity and time-based user interest.

  • PersTour using \(Int^Upd_u\) and \(\eta =0.5\) \((PT-.5U)\) PersTour with balanced emphasis on optimizing both POI popularity and time-based user interest with weighted updates.

  • PersTour using \({\textit{Int}}^{\textit{Freq}}_u\) and \(\eta =1\) (PT-1F) PersTour with full emphasis on optimizing frequency-based user interest, ignoring POI popularity.

  • PersTour using \(Int^Time_u\) and \(\eta =1\) \((PT-1T)\) PersTour with full emphasis on optimizing time-based user interest, ignoring POI popularity.

  • PersTour using \(Int^Upd_u\) and \(\eta =1\) \((PT-1U)\) PersTour with full emphasis on optimizing time-based user interest with weighted updates, ignoring POI popularity.

  • PersTour using \(Int^Upd_u\) and adaptive weighting \(\eta \) by scaling \((PT-AS)\) PersTour with emphasis on optimizing both POI popularity and time-based user interest with weighted updates, where emphasis is based on adaptive weighting by scaling of POI visit counts.

  • PersTour using \(Int^Upd_u\) and adaptive weighting \(\eta \) by cumulative distribution \((PT-AC)\) PersTour with emphasis on optimizing both POI popularity and time-based user interest with weighted updates, where emphasis is based on adaptive weighting by cumulative distribution of POI visit counts.

These variants allow us to best evaluate the effects of different \(\eta \) values and compare between frequency-based interest and time-based interest (with and without weighted updates). As PT-0 does not consider user interest, there is no need to choose between time-based or frequency-based user interest. The PT-0, PT-.5F, PT-.5T, PT-.5U and PT-1F, PT-1T, PT-1U algorithms allow us to investigate the effect of different emphasis on POI popularity and the different types of user interests, i.e., by adjusting the \(\eta \) parameter. These algorithms offer tourists the flexibility to explicitly specify their preference between the two components of POI popularity and user interests. If the tourist prefers to determine this preference automatically, the PT-AS and PT-AC algorithms provide alternatives where this emphasis (i.e., the \(\eta \) parameter) between the two components of POI popularity and user interests can be automatically learned.

5 Experimental methodology

In this section, we elaborate on the experimental dataset, baseline algorithms and evaluation metrics that are used for our experimental evaluation.

5.1 Dataset

For our experiments, we use the Yahoo! Flickr Creative Commons 100M (YFCC100M) dataset [41, 48], which consists of 100M Flickr photographs and videos. This dataset also comprises the meta information regarding the photographs, such as the date/time taken, geo-location coordinates and accuracy of these geo-location coordinates. The geo-location accuracy ranges from world level (least accurate) to street level (most accurate).

Table 1 Dataset description

Using the YFCC100M dataset, we extracted geo-tagged photographs that were taken in ten different cities, namely Toronto, Osaka, Glasgow, Budapest, Perth, Vienna, Delhi, Edinburgh, Tokyo and London. To ensure the best accuracy and generalizability of our results, we only chose photographs with the highest geo-location accuracy and experimented on ten touristic cities around the world. A more detailed description of our dataset is shown in Table 1. This dataset is also publicly available at https://sites.google.com/site/limkwanhui/datacode#ijcai15.

5.2 Baseline algorithms

We compare our PersTour algorithms against five different baseline algorithms, which can be divided into two broad categories. The first category is based on the popular collaborative filtering (CF) recommender systems [34, 51, 52], which utilizes a user’s (tourist’s) rating on the items (POIs) to recommend a set of item for another user based on their user similarities. Based on two definitions of user ratings, we implemented two variations of CF-based baseline algorithms, namely

  • Collaborative filtering based on photographs uploaded \((CF-Pho)\) The user/tourist’s rating on each item/POI is based on the number of uploaded photographs of that particular POI he/she has uploaded, i.e., a higher number of uploaded photographs correspond to a higher rating for that POI.

  • Collaborative filtering based on POIs visited \((CF-Bin)\) The user/tourist’s rating on each item/POI is based on whether they have visited that particular POI, i.e., a binary rating of 1 (visited) or 0 (not visited).

As CF-based algorithms recommend the top-K individual POIs instead of an itinerary of connected POIs, we implemented additional processing steps to ensure a consistent output result for our tour recommendation problem. Based on a starting POI \(p_1\) (like our PersTour algorithm), the CF-Pho and CF-Bin algorithms will iteratively add in the highest ranked POI from the top-K results, until either: (i) the budget B is exhausted or (ii) the destination POI \(p_N\) is reached.

The second category of baseline algorithms is variations of greedy-based approaches that have also been used in other tour recommendation research [4, 6, 46]. Similar to our PersTour approach, these baseline algorithms commence from a starting POI \(p_1\) and iteratively choose a next POI to visit until either: (i) the budget B is exhausted; or (ii) the destination POI \(p_N\) is reached. The sequence of selected POIs thus forms the recommended itinerary, and the three greedy-based baselines are:

  • Greedy nearest (GNear) Chooses the next POI to visit by randomly selecting from the three nearest, unvisited POIs.

  • Greedy most popular (GPop) Chooses the next POI to visit by randomly selecting from the three most popular, unvisited POIs.

  • Random selection (Rand) Chooses the next POI to visit by randomly selecting from all unvisited POIs.

GNear and GPop are meant to reflect tourists’ behavior by visiting nearby and popular POIs, respectively, while Rand shows the performance of a random-based approach.

5.3 Evaluation

We evaluate PersTour and the baselines using leave-one-out cross-validation [18], i.e., when evaluating a specific travel sequence of a user, we use this user’s other travel sequences for training our algorithms. Specifically, we consider all real-life travel sequences with \(\ge \)3 POI visits and evaluate the algorithms using the starting POIs and destination POIs of these travel sequences. Thereafter, we evaluate the performance of each algorithm based on the recommended tour itinerary I using the following metrics:Footnote 5

  1. 1.

    Tour recall: \(T_{R}(I)\) The proportion of POIs in a user’s real-life travel sequence that were also recommended in itinerary I. Let \(P_r\) be the set of POIs recommended in itinerary I and \(P_v\) be the set of POIs visited in the real-life travel sequence, tour recall is defined as: \(T_{R}(I) = \frac{|P_r \cap P_v|}{|P_v|}\).

  2. 2.

    Tour precision: \(T_{P}(I)\) The proportion of POIs recommended in itinerary I that were also in a user’s real-life travel sequence. Let \(P_r\) be the set of POIs recommended in itinerary I and \(P_v\) be the set of POIs visited in the real-life travel sequence, tour precision is defined as: \(T_{P}(I) = \frac{|P_r \cap P_v|}{|P_r|}\).

  3. 3.

    Tour: \(F_1\)-score \(T_{F_1}(I)\) The harmonic mean of both the recall and precision of a recommended tour itinerary I defined as: \(T_{F_1}(I) = \frac{2 \times T_{P}(I) \times T_{R}(I)}{T_{P}(I)+T_{R}(I)}\).

  4. 4.

    Root-mean-square error (RMSE) of POI visit duration: \(T_{\textit{RMSE}}(I)\) The level of error between our recommended POI visit durations in itinerary I compared to the real-life POI visit durations taken by the users. Let \(I^s \in I\) be the recommended POIs that were visited in real life,Footnote 6 and \(D_r\) and \(D_v\) be the recommended and real-life POI visit durations, respectively, RMSE is defined as: \(T_{\textit{RMSE}}(I) = \sqrt{\frac{\sum _{p \in I^s} (D_r-D_v)^2}{|I^s|}}\).

  5. 5.

    Tour popularity: \(T_{\textit{Pop}}(I)\) The overall popularity of all POIs in the recommended itinerary I defined as: \(T_{\textit{Pop}}(I) = \sum \limits _{p \in I} Pop(p)\).

  6. 6.

    Tour interest: \(T^u_{\textit{Int}}(I)\) The overall interest of all POIs in the recommended itinerary I to a user u defined as: \(T^u_{\textit{Int}}(I) = \sum \limits _{p \in I} \textit{Int}_u(\textit{Cat}_p)\).

  7. 7.

    Popularity and interest rank: \(T^a_{\textit{Rk}}\) The average rank of an algorithm a based on its \(T_{\textit{Pop}}\) and \(T_{\textit{Int}}\) scores ranked against other algorithms (1 \(=\) best, 12 \(=\) worst).

We selected these metrics to better evaluate the following: (i) time-based versus frequency-based user interest, using Metrics 1–3; (ii) personalized versus non-personalized POI visit durations, using Metric 4; and (iii) PersTour versus baselines, using Metrics 5–7. As personalized POI visit durations only apply to PersTour and not the baselines, we only report \(T_{\textit{RMSE}}\) scores for the PT-0, PT-.5F, PT-.5T, PT-.5U, PT-1F, PT-1T and PT-1U algorithms. Our baseline for comparing \(T_{\textit{RMSE}}\) is variants of PersTour that use non-personalized POI visit durations, i.e., average POI visit durations.

Fig. 2
figure 2

Overview of results (average scores) across all ten cities, in terms of recall (\(T_{R}\)), precision (\(T_{P}\)) and F\(_1\)-score (\(T_{F_1}\))

6 Results and discussion

In this section, we discuss our experimental results and highlight our main findings.

6.1 Comparison between time-based and frequency-based user interest

Table 2 Comparison between time-based user interest (PT-.5T and PT-1T) and frequency-based user interest (PT-.5F and PT-1F), in terms of recall (\(T_{R}\)), precision (\(T_{P}\)) and F\(_1\)-score (\(T_{F_1}\))
Table 3 Comparison between time-based user interest (PT-.5T and PT-1T) and frequency-based user interest (PT-.5F and PT-1F), in terms of recall (\(T_{R}\)), precision (\(T_{P}\)) and F\(_1\)-score (\(T_{F_1}\))

Figure 2 presents an overview of results in terms of the average recall (\(T_{R}\)), precision (\(T_{P}\)) and F\(_1\)-score (\(T_{F_1}\)) across all ten cities, for the different variations of our PersTour algorithm and the baselines. The results show that all variants of PersTour out-perform the five baselines in terms of \(T_{R}\) and \(T_{F_1}\) scores. In terms of \(T_{P}\) scores, the PersTour variants of PT-0, PT-.5T, PT-.5U and PT-.5F out-perform all baselines, while the CF-Pho and CF-Bin out-perform the PersTour variants of PT-1T, PT-1U and PT-1F. We now examine the performance of our PersTour algorithm and the baselines in greater detail.

Moving on to the results for individual cities, we study the performance difference between using time-based user interest and frequency-based user interest, as shown in Tables 2 and 3. Comparing the \(T_{F_1}\) scores between PT-.5T, PT-.5U and PT-.5F, and between PT-1T, PT-1U and PT-1F, the results show that PersTour using time-based user interest (PT-.5T, PT-.5U, PT-1T and PT-1U) out-performs its counterpart that uses frequency-based user interest (PT-.5F and PT-1F), in most cases. This observation highlights the effectiveness of time-based user interest in recommending tours that more accurately reflect real-life tours of users, compared to using frequency-based user interest. While PT-1T and PT-1U under-perform PT-1F in terms of \(T_{R}\) for some cities, we focus more on the \(T_{F_1}\) scores as it provides a balanced representation of both \(T_{R}\) and \(T_{P}\). Moreover, for all cities, PT-.5T, PT-.5U, PT-1T and PT-1U (time-based user interest) result in higher \(T_{P}\) scores, compared to its PT-.5F and PT-1F counterparts (frequency-based user interest). Another observation is that all PersTour variants also out-perform the five baselines for all cities, in terms of \(T_{F_1}\) scores.

Fig. 3
figure 3

Toy example illustrating the difference between time-based user interest and frequency-based user interest

The reason for the more accurate recommendations of time-based user interest compared to frequency-based user interest is due to its use of POI visit durations instead of POI visit frequency. Figure 3 illustrates a toy example that highlights the difference between time-based user interest and frequency-based user interest. Consider user A who only visited two parks but spent three or more hours at each of them and user B who visited five parks but spent less than 15 minutes at each of them. Frequency-based interest incorrectly classifies user B as having more interest in the parks category due to his/her five visits, compared to user A’s two visits. On the other hand, time-based interest more accurately determines that user A has a higher interest in the parks category due to his/her long visit duration, despite user A only visiting two parks. Furthermore, time-based interest can more accurately capture a user’s level of interest based on how much longer this user spends at a POI compared to the average user (e.g., a user is more interested if he/she spends 3 h at a POI when the average time spent is only 30 minutes). With the availability of user interest levels, we can better personalize POI visit duration for each unique user, which we evaluate next.

6.2 Comparison between personalized and non-personalized visit durations

Table 4 Comparison between personalized and non-personalized visit durations, in terms of RMSE (\(T_{\textit{RMSE}}\))
Table 5 Comparison between personalized and non-personalized visit durations, in terms of RMSE (\(T_{\textit{RMSE}}\))

The \(T_{\textit{RMSE}}\) scores in Tables 4 and 5 show that our recommendation of a personalized POI visit duration (Definitions 5 and 8) out-performs the non-personalized version in 62 out of 70 cases, based on a smaller error in the recommended POI visit durations. This result shows that personalizing POI visit duration using time-based user interests more accurately reflects the real-life POI visit duration of users, compared to the current standard of simply using average POI visit duration. Apart from recommending accurate POIs (\(T_{F_1}\) scores), recommending the appropriate amount of time to spend at each POI is another important consideration in tour recommendation, which has not been explored in earlier works that also aim to recommend tour itineraries.

Previously, we have observed how time-based interest results in more accurate POI recommendations based on the \(T_{F_1}\) scores. Our personalized POI visit duration builds upon this success by customizing the POI visit duration to each unique user based on his/her relative interest level, i.e., spend more time in a POI that interests the user and less time in a POI that the user is less interested in. Accurate POI visit durations have another important implication in tour recommendation, where spending less time at un-interesting POIs frees up the time budget for more visits to POIs that are more interesting to the user. Similarly, a user might prefer to spend more time visiting a few POIs of great interest, compared to visiting many POIs of less interest to the user.

Fig. 4
figure 4

Overview of results (average scores) across all ten cities, in terms of popularity (\(T_{\textit{Pop}}\)), interest (\(T_{\textit{Int}}\)) and rank (\(T_{\textit{Rk}}\)). Number within brackets indicates the rank based on popularity and interest scores, where 1 \(=\) best and 12 \(=\) worst

6.3 Comparison of popularity and interest

We now present an overview of the results in terms of the average popularity (\(T_{\textit{Pop}}\)), interest (\(T_{\textit{Int}}\)) and rank (\(T_{\textit{Rk}}\)) score for all ten cities, as shown in Fig. 4. In particular, we are most interested in the \(T_{\textit{Rk}}\) score that is derived from the average rank of an algorithm based on its \(T_{\textit{Pop}}\) and \(T_{\textit{Int}}\) scores, compared to other algorithms. For \(T_{\textit{Rk}}\) scores, a value of 1 indicates the best performance, while 12 indicates the worst performance. Based on this \(T_{\textit{Rk}}\) score, all variants of PersTour out-perform the five baselines, with PT-.5U being the best performer. We observe a similar performance in terms of \(T_{\textit{Int}}\) scores, where all variants of PersTour out-performing the baselines. In terms of \(T_{\textit{Pop}}\) scores, the PersTour variants of PT-0, PT-.5T, PT-.5U and PT-.5F out-perform all baselines, while PT-1T, PT-1U and PT-1F under-performs the GPop baseline. This performance is understandable as the PT-1T, PT-1U and PT-1F algorithms emphasize fully on user interest preferences, while the GPop baseline focuses on the most popular POIs thus maximizing the \(T_{\textit{Pop}}\) scores for the latter. Next, we provide a more in-depth discussion of the performance among the various PersTour variants.

Based on the \(T_{\textit{Rk}}\) scores in Tables 6 and 7, we observe that PT-.5U (time-based user interest with weighted updates) is overall the best performer, and PT-.5T (time-based user interest) is the second best performer.Footnote 7 In addition, we also observe that PT-1U (time-based user interest with weighted updates) out-performs its PT-1F counterpart (frequency-based user interest) for eight out of ten cities, with the same performance for the remaining two cities. These results show the benefits of applying weighted updates to user interests (PT-.5U and PT-1U), compared to simply using time-based user interest without weighted updates (PT-.5T and PT-1T).

Next, we examine how PersTour (with and without weighted updates) performs against the various baselines. Both PT-.5U and PT-.5T out-perform all baselines as well as its PT-.5F counterpart that uses frequency-based user interest. Similarly, PT-1T (time-based user interest) out-performs PT-1F (frequency-based user interest) for six out of ten cities, with the same performance for the remaining four cities. These results show the effectiveness of time-based user interest (both with and without weighted updates) over frequency-based user interest, based on the \(T_{\textit{Rk}}\) scores.

The effects of the \(\eta \) parameter can be observed in the \(T_{\textit{Pop}}\) and \(T_{\textit{Int}}\) scores. A value of \(\eta =0\) (PT-0) results in the best performance in \(T_{\textit{Pop}}\) and worst performance in \(T_{\textit{Int}}\), while a value of \(\eta =1\) (PT-1F, PT-1T and PT-1U) results in the opposite. While we include the \(T_{\textit{Pop}}\) and \(T_{\textit{Int}}\) scores for completeness, we are more interested in \(T_{\textit{Rk}}\) as it gives a balanced measurement of both \(T_{\textit{Pop}}\) and \(T_{\textit{Int}}\).

Table 6 Comparison of PersTour (PT) against baselines, in terms of popularity (\(T_{\textit{Pop}}\)), interest (\(T_{\textit{Int}}\)) and rank (\(T_{\textit{Rk}}\))
Table 7 Comparison of PersTour (PT) against baselines, in terms of popularity (\(T_{\textit{Pop}}\)), interest (\(T_{\textit{Int}}\)) and rank (\(T_{\textit{Rk}}\))

6.4 Comparison of PersTour with adaptive weights

To evaluate the effectiveness of using adaptive weights, we compare PersTour without adaptive weights (PT-.5U and PT-1U) against PersTour with adaptive weights (PT-AS and PT-AC). We focus mainly on the top and bottom 15% of users of each city, based on their number of total POI visits. The reason for choosing these users is that adaptive weights are most beneficial to such outlier users as we can recommend more personalized tours to users with many POI visits and popular tours to users with little POI visits.

Our main evaluation metrics are the \(T_{R}\), \(T_{P}\) and \(T_{F_1}\) scores as they indicate the effectiveness of adaptive weights in recommending tours that correspond to real-life visits. Table 8 shows that PT-AS has the overall best performance as indicated by the highest \(T_{R}\), \(T_{P}\) and \(T_{F_1}\) scores for seven, five and six cities, respectively, out of all ten cities. These results show the effectiveness of implementing adaptive weights for different users, i.e., a different level of emphasis between POI popularity and user interest preferences for different users.

Table 8 Comparison between PersTour with weighted updates and PersTour with adaptive weightings, in terms of recall (\(T_{R}\)), precision (\(T_{P}\)) and F\(_1\)-score (\(T_{F_1}\))

7 Conclusion and future work

We modeled our tour recommendation problem based on the Orienteering problem and proposed the PersTour algorithm for recommending personalized tours. Our PersTour algorithm considers both POI popularity and user interest preferences to recommend suitable POIs to visit and the amount of time to spend at each POI. In addition, we implemented a framework where geo-tagged photographs can be used to automatically detect real-life travel sequences and determine POI popularity and user interest, which can then be used to train our PersTour algorithm. Our work improves upon earlier tour recommendation research in three main ways: (i) we introduce time-based user interest derived from a user’s visit durations at specific POIs relative to other users, instead of using a frequency-based user interest based on POI visit frequency; (ii) we personalize POI visit duration based on the relative interest levels of individual users, instead of using the average POI visit duration for all users or not considering POI visit duration at all; and (iii) we introduce two adaptive weighting methods to automatically determine the emphasis on POI popularity and user interest preferences.

Using a Flickr dataset across ten cities, we evaluate the effectiveness of our PersTour algorithm against various collaborative filtering and greedy-based baselines, in terms of tour popularity, interest, precision, recall, F\(_1\)-score and RMSE of visit duration. In particular, our experimental results show that: (i) using time-based user interest results in tours that more accurately reflect the real-life travel sequences of users, compared to using frequency-based user interest, based on precision and F\(_1\)-score; (ii) our personalized POI visit duration more accurately reflects the time users spend at POIs in real-life, compared to the current standard of using average visit duration, based on the RMSE of visit duration; (iii) PersTour and its variants out-perform all baselines in most cases, based on tour popularity, interest, precision, recall and F\(_1\)-score; and (iv) our adaptive weighting methods further improve the performance of PersTour, based on precision, recall and F\(_1\)-score.

In this work, we focused mainly on recommending tours that are personalized to individual users based on their time-based user interest. Some possible directions for future work are:

  • Modeling uncertainty in POI visit duration based on the day of the week and time of the day. The main consideration for this work is to incorporate some uncertainty in the amount of time recommended at various POIs due to delays caused by crowds (e.g., POIs are more crowded during weekends than weekdays, thus causing possible delays)

  • Recommending tour itineraries that utilize multiple types of transport (e.g., walking, bus, train, taxi, car), instead of a single type of transport. The main motivation of this future work would be to offer users the flexibility to switch between different modes of transport, while excluding certain types (e.g., either bus, train or taxi but no walking).

  • When using public transport (e.g., bus, train, tram), recommend tour itineraries that consider the arrival and departure times of public transport to minimize the waiting time by the tourists for their respective public transport to arrive. Furthermore, we can also model uncertainty in the arrival times, especially when there are connections between multiple transport modes.