Keywords

1 Introduction

A street network, namely a system of interconnecting lines and points that represent a system of streets or roads for a given area, provides the foundation for network analysis; for example, finding the best route. One of the most remarkable Operational Research (OR) problems related to a street network is that faced by a tourist who has to visit an area of a city in a limited amount of time. In this case the aim is that of selecting the most interesting places to visit on the basis of their personal interests, and of creating the shortest suitable route connecting them. In OR such a problem is usually modeled as a graph.

To plan a feasible tour the tourist has to collect information from different sources (websites, magazines or guidebooks) about the different Points of Interest (POIs), make a selection among these POIs and plan an itinerary connecting the selected points, considering the available time, the opening hours and visiting times of the different POIs, the weather forecasting and other different kinds of constraints [1]. Given the complexity of the problem, the building of the tour by combining all the preferences and constraints presents considerable difficulties.

In the last years several Personalized Electronic Tourist Guides (PETGs) relying on mobile computing have been developed to better perform the task fulfilled by the local tourist organizations.

Within this paper the design of a personalized multiple–day walking tour in old city centres, which takes into account a set of POIs with a score, a set of waiting and visiting times, and a set of daily opening hours of these POIs, is investigated. Combined with the tourist’s trip constraints and environmental contexts, these sets allow defining a problem which is an extension of the well–known Team Orienteering Problem with Time Windows (TOPTW) where the number of team members is replaced by the number of days available for the tourist to stay, the start and the arrival positions are not necessarily coincident in each day of the tour, and can change from day to day [2, 3].

Since TOPTW is a highly–constrained combinatorial optimization problem [4] that cannot be solved in polynomial time [5, 6], a multiobjective evolutionary optimizer is proposed to find in a reasonable computational time near–optimal solutions to the planning of a multiple–day route. Such an optimizer is innovative as it considers a higher number of objectives and of features with respect to some recent tools for building walking–only itineraries [2, 7, 8].

The paper is organized as follows: Sect. 2 presents a description of the optimizer used to deal with the problem under investigation. Section 3 describes the multiobjective evolutionary algorithm employed to find multiple–day tours. In Sect. 4 the findings of the proposed approach are shown and discussed. Finally, the last section contains conclusion remarks and future works.

2 The Generator of the Personalized Tour

The core of the hypothesized PETG, schematized as in Fig. 1, is our optimizer, i.e., the generator of the personalized tour. Such an optimizer is supposed to interact with several input modules to gather information about the tourist’s interests and requirements, and environmental constraints, and with an output module to endow the tourist with the planned multiple–day itinerary.

Fig. 1.
figure 1

The architecture of the PETG.

Our optimizer needs the following input information:

  • Information provided by the user: the user identifier, the average moving speed (young, old, family with children, etc.), the wheelchair or disabled access, the day of tour beginning, the number of days available for the tour, number and identification codes of the POIs that the user requests to be included in the visit. Moreover, for each day the optimizer needs the start/arrival positions and the start and finish times of programmed pauses.

    The start and arrival positions, represented by the GPS coordinates, i.e., longitude and latitude, are selected by the user either from the map, or from the list of tourist attractions or from a set of locations (‘access doors’) corresponding to the public transportation stations/stops close to the area.

  • Rating of the POIs. The evolutionary optimizer hypothesizes a recommender system which assigns the ‘score’ or ‘rating’ to each POI on the basis of the personal preferences of the tourist. The rating, which measures the POI attractiveness for that specific user, is supposed to be estimated by a profiling phase previously effected either through the data extracted from the social media or from a questionnaire which the same user is invited to fill during the preliminary connection phase.

  • POI information. The information related to each POI, to the distance and routes between POIs, and to the context is stored in suitable databases. Specifically, the data supposed to be known for each POI are the POI identifier, the name of the POI, the position in terms of GPS coordinates, the average waiting and visiting times, the opening hours for each day of the tour, the accessibility for disabled people, a flag as outdoor or indoor site.

  • POI distances. The POI distances are represented by a real–valued triangular matrix with a dimension equal to the number of POIs. The value of each cell \((i,j)\) of the matrix indicates the distance between the POIs \(i\) and \(j\) measured on the basis of the shortest route connecting them. This means that it is not the Euclidean or Manhattan distance evaluated on the basis of the GPS coordinates of the two POIs, rather it accounts for the actual structure of the streets in the area containing the POIs.

  • Weather table. A weather table is used to propose, as far as possible, itineraries which take into account the weather conditions. The hourly forecasting is extremely complex, so a granularity equal to three hours (one hour before or after the time interval indicated) is considered As a consequence, for each day of the tour this table contains the weather forecasting subdivided in a number of items equal to that of the three–hour time slots in which the tour falls.

  • Previous tours. Particularly useful is the data file which keeps track of the previous tours effected by the tourists in case of either a multiple–day tour or a tour requested by a user who previously visited the same area. In both cases the knowledge of previous tours avoids proposing an itinerary including already visited POIs, unless the user explicitly requires to visit again some such POIs. Each record of this file is related to a tour made by a single user and reports the user identifier, the day(s) of the tour, the number and the identifiers of the visited POIs.

2.1 Output of the Optimizer

The output of our evolutionary optimizer is the personalized tour automatically saved in a .csv file, named tour.csv, that is processed by an API devised to visualize the corresponding optimized tour on a map directly on the user’s mobile device. The data stored in tour.csv are:

  • The start time.

  • For each POI: the identifier of the POI, the walking time between the current position and the POI, the arrival time at the POI, the waiting time and the finish time of the visit.

  • For each pause: the start and finish times.

  • The walking time to the final destination and the related arrival time.

Moreover a summary of the more relevant information related to the proposed tour is reported in terms of the total number of visited POIs specifically requested by the user, the total number of visited POIs, the total time employed for the tour including waiting, visiting and transfer times, the total covered distance and the score of the complete tour measured as the average of the score of the single POIs included in the tour.

2.2 Problem Statement

To generate personalized multiple–day itineraries respecting predefined time windows, the problem is modeled as a graph consisting of a set of locations and a set of the paths between each pair of these locations. A non–negative score representing the rating of the tourist for that location and a set of time windows is associated with each vertex. A positive weight corresponding to the walking time is associated with each path.

The Objectives. The goal of our optimizer in building a personalized multiple-day itinerary is to optimize five contrasting objectives, namely,

  • maximize the score of the proposed POIs;

  • visit as many POIs as possible among those explicitly requested by the user;

  • visit as many POIs as possible among the recommended ones in addition to those specifically included by the user;

  • complete the tour within the time limit fixed by the user respecting the following commitments: opening and closing hours of the POIs, average visit duration and pause times (rest, lunch, shopping, ... ) required by the user;

  • minimize the covered distance;

  • respect other constraints such as: accessibility to the POIs for people with disabilities and usability of the outdoor POIs in case of rain, high wind, etc.

Differently from the opening times of the POIs, which are hard constraints, the pause times requested by the user are handled as being soft temporal constraints, whose satisfaction can happen in suitable neighborhoods of the times requested by the user if this flexibility allows better encountering her/his preferences in terms of the accessibility and of the number of visited POIs. A detailed description of all the objectives is reported in the following items.

  • Rating of the tour by the tourist side. As each generic POI \(i\) is characterized by a real value \(a(i)\) which represents its attractiveness for a specific user, the rating \(\varPhi _1\) of the complete route tour is equal to:

    $$\begin{aligned} \varPhi _{1}(tour) = \frac{1}{POI\_in\_tour} \sum _{i=1}^{POI\_in\_tour} a(i) \end{aligned}$$

    that is the average of the scores of the single POIs. \(POI\_in\_tour\) represents the number of POIs comprised in the proposed tour. This objective is to be maximized.

  • Number of the POIs explicitly required by the user and actually present in the proposed tour. Denoting with \(POI\_req\) the number of the POIs that the user requires to be comprised in the tour, an ideal tour will be one which includes all the \(POI\_req\). Nevertheless, due to the constraints, the tours proposed by the optimizer could include an actual number of required POIs, named \(POI\_act\), lower or equal to \(POI\_req\). In formula:

    $$\begin{aligned} \varPhi _{2}(tour) = POI\_req - POI\_act \end{aligned}$$

    This objective is to be minimized.

  • Number of total POIs included in the tour. The user is naturally interested in visiting as many POIs as possible in the time available for her/his tour. Therefore the optimizer, in addition to the \(POI\_act\), has to propose a tour able at the same time to assure the greatest rating and to contain the highest number of POIs encountering at best user’s preferences. So we can define the following objective:

    $$\begin{aligned} \varPhi _{3}(tour) = POI\_in\_tour \end{aligned}$$

    This objective is to be maximized and must respect all the constraints in terms of time, tour length, presence of all the desired POIs and so on.

  • Total duration of the tour. It is supposed that the user desires to spend as much time as possible for the visit of the attractions, saving time for pauses (rest, lunch, shopping). To evaluate a generic tour the proposed optimizer takes into account the following variables:

    • \(t\_tran(i,j)\) indicates the time needed to transfer from a POI \(i\) to \(j\) where \(j\) represents the POI successive to \(i\) in the tour under examination (in the case of the first POI it represents the time to reach the first POI from the start point chosen by the user);

    • \(t\_wait(j)\) represents the waiting time to access POI \(j\), due to queues, or other constraints;

    • \(t\_vis(j)\) denotes the visiting time of the POI \(j\);

    • \(t\_tot\_vis(j)\) accounts for the total of the three above–mentioned times;

    • \(t\_arr\) represents the time to reach the finish point chosen by the user from the last POI in the proposed tour;

    • \(t\_pau(k)\) represents the duration of the generic pause \(k\) required by the user (let \(Tot\_pau\) be the total number of these pauses);

    • \(max\_time\_tour\) denotes the time limit declared by the user to complete the tour.

    This premised, the total time of the tour \(\varPhi _4\) is given by:

    $$\begin{aligned} \varPhi _{4}(tour) = t\_arr + \sum _{j=1}^{POI\_in\_tour} t\_tot\_vis(j) + \sum _{k=1}^{Tot\_pau} t\_pau(k) \end{aligned}$$

    This quantity is to be maximized while respecting the constraint \(\varPhi _{4}(tour) \leqslant max\_time\_tour\).

  • Total length of the tour. Another important side to consider is the total covered distance of the tour which must be not excessively long in particular for some categories of potential users, as for example elderly people, families with children, or disabled people. It is to note that the classical Traveling Salesman Problem (TSP) which considers only the distances in order to find an itinerary cannot be used in this case. In fact, for a PETG, in addition to the spatial coordinates, also the temporal side is to be taken into account, as for example the opening and closing hours of the POIs. However, it is important also to account for a spatial objective with the aim to minimize the length of the tour proposed. Such a length is evaluated as:

    $$\begin{aligned} \varPhi _{5}(tour) = d\_init + \sum _{i=1}^{POI\_in\_tour} d(i,i+1) + d\_fin \end{aligned}$$

    where \(d(i,i+1)\) is the distance between a generic POI \(i\) and the next POI \((i+1)\) in the considered tour, \(d\_init\) is the distance between the start position declared by the user and the first POI in the tour under examination, \(d\_fin\) is the distance between the last POI in the tour and the finish point declared by the user.

  • Other constraints. The generated solutions must respect these further constraints: in case of rain or high wind outdoor POIs are to be excluded from the itineraries, unless explicitly requested by the user, and in case of tours required by disabled people the POIs with limited access are to be discarded.

3 The Multiobjective Evolutionary Optimizer

To find near–optimal personalized multiple–day tours a multiobjective evolutionary algorithm, able to satisfy all the contrasting objectives reported in Sect 2.2, is proposed.

3.1 Multiobjective Optimization Notions

To deal with the five contrasting objectives mentioned above, a multiobjective evolutionary algorithm based on the concept of the so–called Pareto optimal set is designed and implemented. To make this paper self–contained we report some fundamentals of the multiobjective optimization. This technique relies on the notion of dominance: as an example, for a problem with two objectives to optimize, each represented by a fitness function \(\varPhi _i\), representing the quality of the solution with respect to the objective, a solution \({\varvec{X}}_{1}\) is said to dominate in the Pareto sense (P–dominate) another solution \({\varvec{X}}_{2}\) if and only if, for any objective, the related \(\varPhi _i({\varvec{X}}_{1})\) is not worse than \(\varPhi _i({\varvec{X}}_{2})\) and is better for at least one of the objectives. A solution \({\varvec{S}}^{{\varvec{*}}}\) is said Pareto–optimal if it belongs to Pareto optimal set. The Pareto–optimal set and the Pareto–optimal front are the sets of Pareto–optimal solutions in design variables and objective function domains, respectively. By doing so, at each generation a set of “optimal” solutions, namely the current Pareto front, emerges where none of them can be considered to be better than any other in the same set. As the number of generations increases the current Pareto front will shift, and will hopefully approach the Pareto–optimal front. Usually, at the end of the execution of the evolutionary algorithm the final Pareto front will be proposed to the user that, among the solutions contained therein, will choose the one which best suits his needs.

3.2 The Methodology

An evolutionary algorithm (EA) is a population–based optimization algorithm which uses mechanisms inspired by biological evolution to find approximate solutions for complex problems [9, 10]. As in each EA, once initialized, a population of \(NPOP\) solutions, called individuals, is let free to evolve from a generation \(g\) to the next one by means of the operators of selection, recombination, and replacement. The components of the EA within this paper are standard with the exception of the mutation operator that is specific for the faced problem, and of the selection which is typical of this multiobjective version. In our case the basic steps of the algorithm can be described as follows:

  • Initialization: an initial population of individuals is randomly generated.

  • Selection: the choice of the individuals which undergo recombination takes place by a random uniform selection among the non dominated solutions. At the end of each generation, it is important to sort the solutions to individuate those belonging to the Pareto front (non–dominated solutions).

  • Recombination: given two elements selected among the non–dominated solutions, we apply as evolutionary operators the uniform crossover [11, 12] and the mutation to recombine the solutions. As mutation the classical exchange [13, 14] and 2–opt variants [13, 15] are employed. As a result one offspring is obtained.

  • Evaluation: being the problem structured as having the five objectives described in Sect. 2.2, the fitness of each solution will be evaluated on each of those five optimization criteria. Considered that this implies the resolution of a multiobjective problem we will make reference to the notion of dominance reported in the above Sect. 3.1.

  • Replacement: the \(i\)–th offspring obtained by the recombination is compared with the \(i\)–th individual in the current population and the already present individual is replaced only if it is dominated by the new generated one.

The four last steps are repeated for each individual so that a new population is obtained. This procedure is repeated for a maximum number of generations \(g_{max}\), with the aim to individuate the best Pareto front solution to be presented as the output of the optimizer. We consider this solution as the one with the lowest distance from the theoretically optimal solution, i.e., the one which perfectly satisfies all the five objectives. This “best” solution is generally located in the intermediate region of the front and is the one which yields a better balance in satisfying all the objectives.

Encoding. Each individual in the population represents a potential multiple–day tour and is encoded by a vector of integer values with dimension equal to the number of POIs. This dimension is denoted with NPOI. Each integer denotes a POI and is present only once in each solution. For example a solution with NPOI=10 is shown in Fig. 2.

Fig. 2.
figure 2

Example of a tour encoding.

This solution, starting from the position chosen by the user (not explicitly contained in such an encoding), proposes to reach the POI in the first cell at the left side of the vector (9 in the figure) and then proceed forward visiting the POI in the second cell (1 in the figure) and so on. Differently from the TSP in which all the points are visited in the order indicated by the solution, in our PETG it is highly probable that not all the POIs are effectively visited. This can happen for several reasons:

  • remaining available time: the tour must terminate when the residual available time is only sufficient to reach the final point from the POI in which the user currently is;

  • closing: the tour can lead the user to a POI during its closing time;

  • previous tours: the POI has already been visited in a previous tour and thus it will be not considered if not expressly required again;

  • multiple–day tour: if the user has required a tour programmed in several days, the POI will be discarded if already included in the tour of one of the previous days.

If a POI is actually visited in the tour, a flag will be set for it. This allows the management of a multiple–day tour. In fact, for the second day the examination of the solution restarts from the leftmost position in the vector as for the first day and all the POIs already visited in the first day are now skipped. Analogously, for each further day, the vector which represents the solution will be examined again, always starting from its leftmost position.

4 Experiments

The algorithm has been coded in Java language and all the tests have been performed on a MacBookPro4.1 Intel Core Duo 2.4 GHz, 2 GB RAM. After a preliminary tuning phase, the population size \(NPOP\) has been set equal to 500, while the number of generations \(g_{max}\) has been set equal to 100. The crossover probability \(CR\) has been set equal to 0.4. For the mutation, the exchange probability \(EM\) has been fixed to 0.8 while the parameter \(FV\) for the 2-opt mutation has been set equal to 1.0. Since evolutionary algorithms are nondeterministic, to individuate the best tour for any given problem 10 runs are performed. The execution time for all the 10 runs is about 13 s.

The algorithm is able to provide multiple–day walking tours in any area once the needed information are made available. Within this paper its ability has been tested for the area of the old city centre of Naples, Italy, by considering 20 POIs. These POIs are listed in Table 1 together with some of the relevant information used for the building of the personalized tours. The list of the used information is not exhaustive. The waiting and visiting times are expressed in minutes. The possible start and finish points of the daily tours, outlined in Table 2, surround the selected area of interest and represent the ‘access doors’ to the old city centre through the local transportation means.

Table 1. The POIs for the old city centre of Naples.
Table 2. The access points for the old city centre of Naples.

In the following, an example of a personalized tour generated as a function of the input information provided by the user is shown. The rating is quantified within the range [0, 100]. The value 100 is assigned to each POI that the user expressly requires as belonging to the proposed tour. Moreover, beside the waiting and visiting times, also the walking and the total tour times are reported in minutes, while the walking and the total covered distances are measured in meters. Lastly, the weather is considered sunny during the whole visit.

In the example the case is considered that the user wishes to perform a two–day tour and requires to visit five \(POI\)s as well.

The input information provided by the user is the following:

  • Day of tour beginning: 01/05/2014

  • average moving speed: 0.5 m/s

  • disabled access request: no

  • number of days available for the tour: 2

  • number of POIs that the user declares to be included in the visit: 5

  • identification codes of these POIs: 3 6 9 12 15

  • For the first day of the tour:

    • start and arrival times: 9.00 am - 7.00 pm

    • start and arrival positions: Dante (M1) - Università (M1)

    • number of programmed rests: 3

    • For each pause:

      • \(*\) start and finish times of pause 1: 10.45 am - 11.15 am

      • \(*\) start and finish times of pause 2: 1.00 pm - 2.30 pm

      • \(*\) start and finish times of pause 3: 5.00 pm - 5.30 pm

  • For the second day of the tour:

    • start and arrival times: 9.30 am - 4.30 pm

    • start and arrival positions: Duomo (M1) - Museo (M1/M2)

    • number of programmed rests: 2

    • For each pause:

      • \(*\) start and finish times of pause 1: 11.00 am - 11:30 am

      • \(*\) start and finish times of pause 2: 1.30 pm - 2.30 pm

The best tour determined in all the runs in accordance to the user information and the rating of the POIs derived from the user’s profile is:

$$ \left( \small { \begin{array} {cccccccccccccccccccc} 17&18&3&2&15&6&7&9&12&11&5&4&16&1&13&18&20&19&10&4 \end{array} } \right) $$
Table 3. The output of the optimizer.

The tour provided by the algorithm for each day is shown in Table 3. The two-day tour associated to this output file is also graphically reported in Fig. 3 over a map of the area representing the old city centre of Naples. This figure evidences each day tour with a different color, namely the first day is reported in red whereas the second one is in blue. As it is simple to verify from Table 3, the tour proposed allows visiting all the POIs expressly required by the user. Moreover, the visualization reported in Fig. 3 demonstrates even more the effectiveness of our evolutionary optimizer. In fact, the proposed two-day tour carries out an “intelligent” optimization by automatically subdividing the area of the old city into two distinct zones, each visited in a different day. It is worth noting that on the first day the planned visit to \(POI\) 3 (Church of Santa Chiara) ends at 12.39  pm, i.e., in good time before this \(POI\) closes at 1  pm for lunch pause.

Fig. 3.
figure 3

The two-day tour proposed for the fourth example.

5 Conclusions and Future Works

Within this paper a multiobjective evolutionary optimizer for solving a TOPTW, characterized by multiple and contrasting objectives, is proposed. Such an optimizer is innovative as regards the number of both optimization criteria and features considered. Its ability to generate personalized multiple–day walking tours in a street network modeled as a graph, respecting user’s interests and limitations, and environmental constraints, has been successfully tested for the old city centre of Naples. This optimizer will constitute the core of a PETG that will be distributed as a free app for use in the above mentioned area.

As future works, we aim to endow our optimizer with a higher flexibility by providing a route planning capable of adapting to new circumstances in real–time to assure an on–the–fly tour updating.