Keywords

1 Introduction

Planning an itinerary is a very common activity when organizing a visit to a tourist location. In doing that we usually take into account a number of dimensions. On the one hand, we consider the Places of Interest (PoI in the following) to be included based on our tourist interests. On the other hand, we take into account a variety of (i) temporal constraints such as the opening hours of the PoIs, the amount of time for visiting each PoI, the time to move from each PoI to the others, information about crowding at different times and (ii) temporal preferences such as the desire to visit as many PoIs as possible or to be more relaxed, the desire to have a lot of free time to stroll, the different alternatives for moving among PoIs.

Tourist recommender systems have been developed to support this process [1, 2]. In particular, the combination of PoIs in an itinerary attracted a lot of attention in the last decades, with the main objective to recommend an itinerary that maximizes a global profit/reward and can be completed within a specific budget [10, 17, 18].

Although many recommender systems based the creation of itineraries on the popularity of the attractions [26, 27], a number of approaches that generate personalized itineraries by considering incorporating user interests and preferences can be found in the literature [5, 11, 13, 14, 30].

In this paper, we propose an innovative approach for generating personalized itineraries that combines different aspects.

First of all, our proposed approach can inter-operate with any tourist recommender. The module that generates the itineraries takes a ranked list of PoIs as its input and we do not make any assumption on how this list is generated. It could be the output of any recommender that generates it based on any type of user preferences about PoIs. The advantages of this choice, especially as regards the evaluation of the approach, will be discussed later in the paper.

Second, we consider a number of time-related user preferences. We carried out a user study to understand which of these dimensions are relevant and how they are taken into account by different types of users [6]. The dimensions we considered range from the density of the itinerary (some users prefer to visit as many PoIs as possible while others prefer to visit fewer PoIs in more depth), the preference to visit PoIs when they are less crowded, the preference of having a lot of free time vs that of having a completely scheduled plan, the preference of mixing PoIs of different types vs that of having homogeneous itineraries with PoIs of the same type.

Third, we consider a number of temporal constraints such as the time to move between each pair of PoIs, the opening hours, the estimated minimum and maximum time to visit a PoI and the number of days of the visit. We thus include a temporal constraint propagation and satisfaction module in the itinerary generator.

Finally, we decided to design our itinerary generation module using genetic algorithms. These types of algorithms are very suitable for the problem we are facing which can be seen as a constrained optimization one. Indeed, genetic and evolutionary algorithms have already been used for this task in the literature, see for example [8, 27,28,29, 34]. A peculiarity of our approach is that the temporal preference dimension will play an essential role in the evaluation function of the genetic algorithm and constraint satisfaction will be used to prune itineraries that violate them and thus are not temporally consistent.

The paper is structured as follows. In Sect. 2, we provide an overview of the relevant state-of-the-art work. Section 3 discusses the temporal dimensions that we take into account in the generation of itineraries and introduces the structure of the temporal knowledge base. In Sect. 4, we discuss our genetic algorithm that generates itineraries taking into account temporal preferences and knowledge, while Sect. 5 illustrates the design and implementation of our prototype and its evaluation. Finally, in the last part of the paper, we discuss limitations and future works.

2 State of the Art

Recommender systems for itinerary generation have a long history in the state of the art. As seen in the introduction, much research has aimed to recommend an itinerary that maximizes a global profit within a specific budget [10, 17, 18].

In recent years, more proposals have incorporated user interests and/or specific preferences to personalize the construction of itineraries [5, 9, 11, 13, 14, 19, 24, 26, 27, 30]. Other approaches aim to recommend itineraries that consider personalized requirements such as attractions visit sequence [16], mandatory attraction categories [20] or group interest satisfaction [15, 21].

Time is a crucial aspect to be managed in recommenders for tourism [3, 23, 31] and especially in the case of itinerary recommendations. However, time is usually considered as a constraint more than an object of user preference as other aspects (categories of place, time of accommodation, etc.). To our knowledge, very few works consider user preferences on temporal dimensions [32, 33].

Evolutionary algorithms have been adopted in the literature for generating touristic itineraries. One of the first works in which a genetic approach was used to solve the itinerary planning problem was that of Chen et al. [8]. In their recommender system, a list of 30 PoIs is produced through a collaborative filtering method and then two distinct genetic algorithms are used: the first one to limit the number of recommended places to 10, minimizing the sum of expenses without exceeding the total available time; the second one to sort them minimizing the travelled distance. Representing a first attempt to use GAs in this domain, the proposed solution was only able to produce a path consisting of an ordered set of PoIs; following researchers tried to include in the recommendation a number of constraints and factors such as real visiting times, places’ opening hours, etc.

Other works designed algorithms to generate itineraries where the number of PoIs is not determined in advance. For example, Ostrowski [22] proposed a system which takes into account attractions’ opening hours and estimated travel times by walk or bus. Zheng et al. [34] also introduced two additional factors (aesthetic fatigue and the variable sightseeing value) in the profit formula, adopting a combination of a GA with a differential evolution algorithm.

The work in [28] investigated for the first time the insertion of restaurants in trips generated by a genetic-based recommender system. In their work, the opening hours of the tourist places and the location at which the user wants to start and end the trip are considered as constraints, while the only parameter taken into account to calculate the results’ fitness is the ranking value of the attractions and restaurants.

Compared with previous work, Tarantino et al. [25] developed an evolutionary-based recommender system that includes a greater number of preferences on context features (break intervals, PoI category) and various item characteristics, such as the location accessibility for disabled people and the type of PoI to be preferred (indoor/outdoor depending on the weather forecast). Furthermore, travel times are estimated taking into account the tourist’s walking speed and the queues at the PoIs.

Tenemaza et al. [27] tried to create multi-day itineraries by separating the itinerary recommendation problem in two parts: in their work, a k-means algorithm (with k equal to the number of available visiting days) is first used to cluster the best PoIs depending on their location and then a genetic algorithm is exploited to optimize the visit schedule for each geographical sector.

Yochum et al. [29] used an adaptive genetic algorithm to personalize touristic itineraries based on the quantity of PoIs present in the solution, their popularity, their overall rating, their cost and the user desire to visit certain places.

With respect to the works discussed in the literature, the genetic algorithm we propose recommends travel itineraries taking into account a wide range of temporal factors modelled as user preferences (e.g., the travel time, the presence of free time, the avoiding of estimated hours in which attractions are crowded) and constraints (e.g., the estimated visiting time for each PoI, the opening hours of recommended places, the start and end dates of the trip and the lunch breaks). The aspects that are personalized according to user preference also cover other factors, including the variety and quantity of PoIs, the desire to visit specific places and the ranking of the recommended PoIs. The advantages of considering user preference for all these factors are presented in Sect. 5. Furthermore, our recommender system is able to produce both single-day and multiple-day itineraries; in the latter case, itineraries presenting imbalances between the single days are assigned a penalty score and thus are unlikely to be suggested. In this way, especially for long trips, we expect the various visit days to be equally enjoyable for the user.

3 Temporal Dimensions and Temporal Knowledge Base

In this section we discuss the temporal dimensions that we take into account in the generation of itineraries and we present the structure of the temporal knowledge base.

We assume that the following temporal information about PoIs is available:

  • Opening hours for each PoI, for each day of the week.

  • Spatial coordinates of each PoI and thus time to transfer (on foot, by public transport or by car) between each pair of PoIs.

  • The average time for visiting each PoI, expressed either as an average duration or as an interval distinguishing between minimal time for a quick view and maximum time for an extensive visit.

  • (Qualitative) information about estimated crowding (e.g., “low”,“medium”,“high”) at different times of each day.

All the pieces of information above are available or can be easily obtained from the Internet. Indeed, in order to design our prototype for the city of Turin we extracted all the information from Google services. Opening and closing times are then expressed as points on the time line, crowded times can be expressed as time intervals, while the time to move between PoIs and visiting times are expressed as bound on differences [12], which is thus the formalism we adopted to represent all the temporal constraints in the knowledge base [4].

As regards users’ temporal preferences, the following dimensions are taken into account:

  • Time (e.g., number of days) available for the visit: date and time of arrival and date and time of departure.

  • User preference on maximizing the number of PoIs to be included in the itinerary. Some people may be interested in visiting as many places as possible, while others may prefer to have more relaxed times between the planned activities.

  • User interest in visiting heterogeneous PoIs. If this information is provided by the individual, the itinerary recommender will try to include different types of PoIs in (each day of) the itinerary (e.g., natural parks, museums, religious places, historical places and buildings, etc.).

  • Willingness to avoid crowding, i.e., whether the user desires to keep away from crowded PoIs or at least crowded times of the day.

  • Willingness to have some free time during the day, i.e., whether the user prefers dense or sparse itineraries. In the latter case, a break interval is added to each day of the itinerary.

  • User preference concerning means of transport and the desire of minimizing transfer times.

These pieces of information constitute the User Model which will be exploited by our algorithm to generate itineraries. We carried out a study on whether these preferences can be correlated to user’s personality and how the personality traits can be used to shape itinerary factors in recommender systems [6].

In the prototype, we derived the user model from a simple questionnaire filled out by the users.

4 Combining Genetic Algorithms and Constraint Satisfaction for Generating Itineraries

In this section, we discuss the algorithm that generates a ranked list of itineraries that are most suitable for a given user. The approach relies on a genetic algorithm which exploits constraint satisfaction in the evaluation process, as we will clarify in the following subsections.

4.1 Inputs to the Algorithm

The algorithm starts by taking the following inputs:

  • A list L of PoIs, ranked according to the user’s interests. As we noticed in the introduction, our itinerary generation process can be coupled with any tourist recommender. We only assumed that a ranked list is produced in some way. This assumption allowed us to decouple the analysis (and evaluation) of the itinerary recommender from one of the algorithms that suggest to a user the PoIs that are most suitable for her/him. Moreover, in this way, our system can inter-operate with any tourist recommender.

  • The temporal knowledge base, as discussed in the previous section.

  • A user model with user preferences concerning the temporal dimensions, as discussed in the previous section.

4.2 Genetic Algorithm

The genetic algorithm works on a population of itineraries. Thus an itinerary is an individual in our approach. In more detail, an itinerary is an allocation of PoIs along the timeline whose extension depends on the duration of the visit to be planned. An itinerary is valid if it is temporally consistent, i.e., if the time constraints concerning the opening times of the PoIs, the time to move from a PoI to the next one in the itinerary and those concerning the visit times are consistent.

The genetic algorithm starts with an initial population of random valid itineraries and loops producing subsequent generations until the “quality” of the population is “satisfactory”. Let us analyze the process in more detail.

Evaluating the Fitness of Individuals. An individual, i.e. an itinerary receives a fitness evaluation taking into account two aspects:

  • The evaluation of the PoIs in the itinerary according to the ranking in the list L. In this way, an itinerary containing PoIs that have a higher ranking in L receives a better evaluation.

  • The evaluation of the itinerary according to the user temporal preferences. Each type of preference is considered separately and we adopted a number of heuristic evaluation criteria:

    • The itineraries are ranked according to the number of PoIs they contain; this ranking contributes positively to the evaluation of an itinerary if the user expressed a preference for maximizing PoIs during the visit, negatively otherwise.

    • Similarly, itineraries are ranked as regards the amount of free time and also in this case the ranking can contribute positively or negatively based on the user’s preference.

    • The presence of PoIs visited during crowded times can impact negatively the evaluation if the user prefers to avoid crowding.

    • The heterogeneity of PoIs contributes positively if the user prefers this type of itinerary, and negatively otherwise.

    • High transfer times can impact negatively if this dimension is relevant to the user.

    • In case planning involves multiple days, balancing among days is taken into account. Itineraries where the fitness in each day is similar are evaluated better than the others. Those that are unbalanced receive a penalty score that negatively impacts the fitness value. Fitness similarity is calculated using standard deviation.

As a result each itinerary receives an evaluation which allows us to rank them inside the population.

Crossover: Generating Descendant Individuals. At each iteration, some individuals in the current population are combined to generate individuals in the new population. The higher the fitness value of the individuals the more likely they are to be combined. Given two individuals \(I_1\) and \(I_2\), a descendant \(I_3\) can be generated in different ways combining parts of \(I_1\) with parts of \(I_2\); for example, the algorithm may take the morning parts of \(I_1\) and the afternoon parts of \(I_2\) or it can alternate PoIs from \(I_1\) to PoIs from \(I_2\). We analyzed both of these strategies and in the current implementation, we selected the first one mentioned above that seemed to provide better results.

Aborting Temporally Inconsistent Individuals. When a new individual \(I_3\) is generated, we run the constraint satisfaction algorithm to check whether it is temporally consistent. If it is not, the individual is aborted and it will not be part of the new population.

Mutations. Periodically some mutations are introduced in the population. An individual I is mutated by randomly selecting some PoIs in the itinerary and replacing them with other ones (possibly with better ranking). Mutated individuals must pass the constraint satisfaction check to enter the new population.

Termination and Solutions. The process above is iterated producing a number of generations. We made experiments with two termination strategies: (i) a fixed number of iterations and (ii) iterating until there is no significant improvement in the evaluation of the best itineraries in the population (see Sect. 5). When the process is terminated a ranked list of itineraries is produced and can be presented to the user.

5 Prototype: Recommending Itineraries in Turin

In order to evaluate our approach, we developed a prototype providing personalized recommendations of itineraries for our town. The goal of the prototype is twofold. On the one hand, it was used to make experiments with and tune the algorithm. On the other hand, it was exploited to perform a preliminary user evaluation.

To this aim, we fist developed a knowledge base with 30 PoIs in Turin, including a variety of attractions (museums, parks, churches, historic buildings, historic squares \(\ldots \)). We then collected all temporal information about the PoIs thus creating the temporal knowledge base.

Fig. 1.
figure 1

Two example PoIs for users to evaluate in our prototype

Tuning of the Algorithm. During the design of the algorithm, we made many experiments varying the dimensions of the population, the crossover and mutation rate, and the number of iterations, so as to find the optimal parameter setting for our application case. This allowed us to tune the algorithm, with the aim of providing optimal results through our prototype. As far as the number of iterations is concerned, we observed that 30 iterations produce stable rankings, but we cannot claim that this can be generalized. Thus, we believe that further experiments should be carried out to better assess the approach and its sensitivity to changes.

Preliminary User Tests. In order to evaluate our approach, we designed an online experiment consisting of several steps, where participants are asked to evaluate different sets of itinerary recommendations. We carried out a preliminary evaluation with 20 people, aged from 18 to over 60, recruited among the acquaintances of the authors using a convenience sampling strategy. Participants were asked to explicitly express their preferences (i) about the PoIs in our knowledge base (see Fig. 1), and (ii) about temporal dimensions (namely: minimization of transfer times, maximization of the number of PoIs per itinerary, maximization of the heterogeneity among PoIs in a certain itinerary, maximization of the number of PoIs they have a strong preference for, the inclusion of free time slots, busy hours avoidance), using a Likert-like scale ranging from 1 (minimum) to 5 (maximum) in both cases.

Fig. 2.
figure 2

An itinerary example

Then, we ran the itinerary recommender to generate personalized itineraries for each user (Fig. 2): once excluding temporal dimensions (R0), once taking such dimensions into account, but with predefined weights (R1), equal for all participants, and, finally, once considering participants’ preferences over the temporal dimensions (R2). Each participant was therefore presented with three alternative itinerary recommendations (randomized to avoid order effects) and was asked to assess, for each of them, how good the recommended itinerary is on the whole (overall evaluation).

The results we obtained are encouraging. In fact, itinerary recommendations generated taking into account participants’ preferences over the temporal dimensions (R2) were evaluated more positively than the other two types of recommendations. More specifically, R2 recommendations obtained an average score of 3.4 (st. dev: 0.94) for the “overall evaluation” aspect, which is higher than the scores obtained by R1 (avg: 3.1, st. dev: 1.07) and R0 (avg: 3.2, st. dev: 0.89) recommendations.

6 Concluding Remarks

In this paper, we presented an approach for generating personalized tourist itineraries using genetic algorithms coupled with temporal constraint satisfaction. The approach is independent of the way PoIs are ranked based on user’s interest and can thus be coupled with any tourist recommender providing this ranking. This choice was motivated by the aim of focusing on evaluating the approach for building itineraries independently of the approach for ranking PoIs. The approach we proposed uses a number of heuristics in the evaluation of the suitability of an itinerary with respect to user’s temporal preferences. These heuristics could be improved in many different ways and indeed we are currently studying the impact of changing the heuristics.

The preliminary evaluation we performed with the prototype provided very encouraging results: in fact, recommendations which take into account temporal dimensions appear to be better in terms of their overall evaluation. A more extensive evaluation will be carried out with a larger number of participants, following the same experimental protocol, in order to confirm these results and test for their statistical significance.

Furthermore, as future work, we will work on incorporating negative preferences in order to fine-tune our personalized algorithm [7].