1 Introduction

Innovation in Information and Communication Technologies (ICT) has substantially transformed the interaction between stakeholders and operating processes. Tourism has not been immune to this transformation: new technologies have substantially changed how tourists experience their travel, how tourists interact with the tour operators of a destination, and how stakeholders operate, increasingly integrating their different operational processes. The Internet and the various ICT solutions have changed the tourism market so deeply that today we talk about Tourism 2.0. The tourism sector represents approximately 11% of global Gross Domestic Product (GDP), employing 200 million people and serving 700 million tourists around the world; for this reason, the tourism industry is defined as one of the largest manufacturing sectors in the world (Kabassi 2010). Customers and stakeholders of the sector usually analyse the information on the web, search for different options and evaluate possibilities, analysing data by themselves (Nemade et al. 2017). Moreover, the rapid development of Social Media and Web 2.0 offers great potential for changing electronic services from a product-driven environment to a social and customer-driven one, giving stakeholders and service providers access to social experience and knowledge for a better understanding and more effective decisions (Dennison and Bourdage-Braun 2009).

The great availability of information, the possibility of “living” the journey in advance thanks to multimedia content, the opportunity to compare thousands of offers from all over the world and the speed of making payments are among the features that make the combination of ICT and the tourism sector a win–win one. From the tourist's point of view, the huge number of options available on the web can have the side effect of creating confusion when booking holidays. On the one hand, the possibilities offered by the various ICT solutions greatly simplify the life-cycle of the tourist experience, i.e. the process accompanying the tourist from the moment he/she plans the trip to when he/she returns home. This aspect has been largely developed by Tourism Recommender Systems (TRS), which filter relevant information for the user in order to provide decision support based on his/her explicit preferences (Kontogianni et al. 2018). On the other hand, the spread of ICT in the tourism sector has allowed tourism services to increase their efficiency and market power, in terms of both cost reductions and the expansion of the functionalities offered to customers, stimulating both the birth of relations between companies and stakeholders in general and the creation of collaborative networks. In the era of Big Data, heavy use of the information available on the web has been registered in different business sectors, and this is also true for the tourism sector (Ardito et al. 2019). The new technologies make it easier to find information, increase the speed of transactions and expand the range of services available but, above all, they make it possible to create an active collaboration space for building the tourist offer. In this way, it is possible to enlarge the perspective of a simple TRS, defining a real Decision Support System (DSS) (Power 1997) with a set of statistical and sensitive analysis tools that can help stakeholders and decision makers to collect information about the liking of particular Points and Events Of Interest (PEOI) and to apply investment or corrective actions on that basis.

In this paper, we describe a DSS developed within an R&D project, funded by the Italian Ministry of Economic Development, named SMARTCAL (Smart Tourism in Calabria), whose aim is to strengthen the tourism sector from both the end-user and the decision-maker perspective. These goals have been achieved by integrating multiple information sources such as user preferences, event calendars, routes, timetables and opinions about destinations gathered from social media. The result is a web platform and a mobile application that support the collection of information about PEOIs through different tools based on sentiment analysis of social networks, in order to facilitate the matching between offer and demand. In the following, we describe the architectural structure of the system, focusing on the main methodological modules that guarantee pro-activity and the capacity to profile users: the sentiment analysis tool and the tour planner.

Given the crucial importance of tourism to regional and sustainable development, such a system may affect social, political, cultural and economic activities. Indeed, the possibility of analysing and extracting knowledge through the sentiment analysis tool provides a qualitative perspective on the appreciation of different PEOIs, highlighting margins of improvement for the decision makers, expressed by a set of performance indicators. Furthermore, the techniques related to social network analysis have also been used to quantify the scores associated with the different PEOIs, in order to build a proactive tourist tour planner that can be very effective from the user's point of view. The tourism potential is determined by the sum of all resources (natural, human, cultural, historical, infrastructure) and all opinions generated (marketing, publicity, sentiment, word of mouth), which, in turn, influence the success of the tourist offer. The quantitative and qualitative perspectives are therefore equally important in the analysis provided by a DSS.

The paper is organized as follows: Sect. 2 briefly reports the state of the art, Sect. 3 describes the proposed architecture and modules of the system, and Sect. 4 presents some conclusions and ideas for future work.

2 Literature review

In order to design an effective and efficient pro-active system, we have analysed the scientific and technological literature related to Decision Support Systems and Recommender Systems in general, with a particular focus on Tourist Recommender Systems. Furthermore, we report the most important works related to Tourist Tour Planner engines.

A Decision Support System (DSS) is defined as a computer-based system that supports the decision-making process (Finlay 1994); in general, it can assume different forms and be built using different architectures and components (Eom et al. 1998). A properly designed DSS is typically an interactive software-based system that helps decision makers extract useful information from raw data, documents, personal knowledge and/or business models in order to identify and solve problems. A general taxonomy of DSSs was described by Alter (1980), who distinguished them by the type of analysis performed by the system: file drawer systems, data analysis systems, information analysis systems, accounting and financial models, representational models and optimization models. Power (2002) proposed a different classification: Data-Driven DSSs, which emphasize the manipulation of large databases of structured data (especially time series); Model-Driven DSSs, which manipulate data through models (including financial, representational and optimization models); Document-Driven DSSs, which integrate document analysis using a variety of storage and processing technologies; Communications-Driven and Group DSSs, which include communication, collaboration and decision support technologies that allow groups of decision makers to coordinate; and Knowledge-Driven DSSs, which are computer systems with specialized problem-solving expertise, usually implementing data mining techniques. More details about the sub-category classification and the technologies used can be found in Baggio and Caporarello (2005). DSSs have also spread in the tourism sector, with many examples described in Noor et al. (2010), Baggio and Caporarello (2005) and Mariani et al. (2018).

In the SMARTCAL project, a Data-Driven DSS has been implemented in order to extract the required information from implicit preferences expressed through social network interactions and from explicit preferences expressed by tourists and through social network interactions. The proposed DSS contains a specific module that builds statistical tools and Key Performance Indicators for stakeholders and decision makers, allowing the quality and liking of the tourist PEOIs to be analysed. The SMARTCAL DSS also contains a Tourist Recommender System module that supports the user in building optimized tourist tours based on the knowledge extracted from his/her own implicit and explicit preferences.

According to the literature, Recommender Systems (RSs) can be generally classified into three main categories: collaborative filtering, content-based and hybrid. Collaborative filtering (CF) recommendations are based on similar preferences expressed by groups of people in the past. The main approaches used in this case are memory-based and model-based, which extract knowledge through different Machine Learning (ML) techniques (Bjelica 2010), probabilistic Latent Semantic Analysis (pLSA) (Hofmann 2004; Yin et al. 2009), matrix factorization (Bauer and Nanopoulos 2014; Zhai and Li 2015), and clustering and regression techniques (Nilashi et al. 2016). For an in-depth overview, the reader can refer to Sana et al. (2019). Content-based (CB) RSs (Pazzani and Billsus 2007) extract item features from descriptions and match them with the preferences in the user profile in order to make a recommendation. In this case too, a wide set of ML approaches can be implemented, such as Support Vector Machines (SVM) (Pronoza et al. 2016), Bayesian Networks (Pecli et al. 2015), Decision Trees (Alemeye and Getahun 2015), etc. Hybrid RSs are obtained by combining collaborative filtering and content-based approaches (Geng et al. 2015; Nguyen et al. 2016; Verma et al. 2016). More recently, attention has shifted to TRSs that also focus on giving recommendations that are both relevant and serendipitous (Kotkov et al. 2016; Pan 2016; Guo et al. 2017).

Tourism is an important sector for economic development, and TRSs have therefore spread widely in recent years. A great number of TRSs are based on Multi-Agent Systems (MASs), i.e. sets of agents that interact, cooperate and coordinate with each other to reach desired goals and solve problems. This type of TRS allows high-level models to be built for heterogeneous systems and environments. Examples of MAS-based TRSs in the literature are MARST (Bedi et al. 2014), PersonalTour (Lorenzi et al. 2011) and Turist@ (Batet et al. 2012). Other TRSs are focused on knowledge-based intelligent systems, which use ontologies to represent the domain knowledge in order to enable reasoning processes. Some examples are SAMAP (Castillo et al. 2008), SigTur (Moreno et al. 2013), SmartMuseum (Ruotsalo et al. 2013), GeOasis (Santiago et al. 2012) and e-Tourism (Sebastia et al. 2008). Most of these systems use generic ontologies or integrate different ontologies in order to represent the domain knowledge. For an extensive review of TRSs, the reader can refer to Borràs et al. (2014). A Tourist Recommender System usually offers some functionalities for designing a personalised trip for the user, based on the explicit or implicit preferences of the tourist. This activity is usually carried out by a system module known as the Tourist Tour Planner (TTP). Many applications can be found in the literature in this context: Ardissono et al. (2003) described a real application named INTRIGUE (INteractive TouRist Information GUidE), based on a fuzzy logic-based recommender system; Maruyama et al. (2004) introduced a personal navigation device with an engine for calculating tourist routes, later extended by Kinoshita et al. (2006). A complete literature overview was presented in Souffriau and Vansteenwegen (2010). Further details about the SMARTCAL system are described in the following section.

3 SMARTCAL as a decision support system

The main features of the proposed system are shown in Fig. 1. Users access all the services of the platform via a web browser or a mobile app. Services are classified according to the class of users they are addressed to:

  • Citizen/Tourist services (shown in green). They refer to data consultation about topics and PEOIs, and to a Tourist Tour Planner based on a Multi-modal route generator. In detail:

    • Content browsing and search: this module allows the user to search for specific PEOIs to gather both the latest news and the direct expression of other users available in online social media;

    • Multi-modal route generator: this module supports the tourist tour planning engine by building the distance matrix for moving from origins to destinations with different modalities (train, bus, walking, car, ...);

    • Tourist tour planner: this module builds an itinerary for visiting a set of points of interest, considering the user preferences learned from the social networks analysis. Section 3.2 provides a detailed description of this service.

  • Manager services (shown in violet). They refer to all the services provided to the decision makers and to the managers of the PEOIs registered in the system. In detail:

    • Social media crawling: this component collects user posts in online social media that are related to a given PEOI. All the messages concerning a given location or event are crawled by collecting all the streams mentioning each PEOI specified by the decision maker. This module is necessary for enabling the subsequent sentiment analysis module;

    • Sentiment analysis module: this module extracts knowledge about users' polarity towards different PEOIs by analysing social media contents, in order to provide useful and timely insights to decision makers about a given location or event. A more detailed description is provided in Sect. 3.1;

    • Content management: this module supports the management of the different data and contents inserted into the database; in particular, it updates, removes or changes information related to PEOIs and other data structures.

The main capabilities of the system are offered to the different classes of users through a Web Application, for accessing services via a browser, and a Mobile Application, for a flexible use of services and contents via smartphone and tablet. From the tourist's point of view, the Web Application and the Mobile Application allow users to plan visits, considering both points of interest and events in the area, and to use content during the visit itself (reading descriptions of places, works and events, viewing photos and listening to any audio guides). From the decision maker's point of view, the Web Application presents an administrative panel for managing contents and for analysing statistics on tourist flows; the platform collects this information and conveys it to the decision maker.

Fig. 1 Functional architecture

3.1 Sentiment analysis module

Nowadays, everything tends to be social and online; therefore, decision makers should take into account what people think about PEOIs. Comments and interactions in online social networks need to be analysed in order to obtain useful insights and acquire new knowledge for planning and revising decisions (Pozzi et al. 2016). In the tourism sector, this is even more important, since most users travelling in a given area share their thoughts and feelings by spreading both positive and negative comments about what they have experienced. The importance of dealing with both contents and relationships pushed researchers to propose a novel paradigm called approval network (Fersini et al. 2017), aimed at capturing both constructuralism and homophily for a more realistic modelling of opinion sharing and contagion. Homophily is the principle stating that contact among similar people occurs at a higher rate than among dissimilar people. Constructuralism, on the other hand, assumes that people who share knowledge are more likely to interact (creating ties), and that ties form in order to share knowledge rather than for socio-demographic reasons.

Approval networks have therefore introduced the possibility to model opinions in a relational environment, where users and messages are not independent of each other. In order to approach the problem of sentiment classification in online social networks, we propose to combine approval networks and ensemble learning by introducing a Relational Bayesian Modeling Averaging (RBMA) approach.

The aim of the proposed RBMA is to provide predictions based on several relational machine learning models. We consider C as the set of baseline relational classifiers and D as the dataset with features and labels. Each classifier \(i \in C\) provides the probability \(P(l(s) | i, D)\) for tuple s in dataset D to be classified with label l(s). RBMA, grounded on the Bayesian Model Averaging approach (Pozzi et al. 2013; Fersini et al. 2014), provides a weighted estimate based on the total probability theorem. In particular, the probability for tuple s to be labelled as l(s) is described in Baldi et al. (2020) and computed as:

$$\begin{aligned} P (l(s) | C, D)= \sum \limits _{i \in C} P (l(s) | i, D)P(i|D) \end{aligned}$$
(1)
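As a minimal illustration of Eq. (1), the following Python sketch combines the label probabilities of several classifiers into a single weighted estimate. It is not the SMARTCAL implementation: the function name `rbma_predict`, the classifier names, the probabilities and the weights standing in for \(P(i|D)\) are all hypothetical values chosen for illustration, and the weights are simply normalised so that they sum to one.

```python
from typing import Dict


def rbma_predict(model_probs: Dict[str, Dict[str, float]],
                 model_weights: Dict[str, float]) -> Dict[str, float]:
    """Weighted average of per-classifier label probabilities, as in Eq. (1).

    model_probs[i][label] plays the role of P(l(s) | i, D) for one tuple s.
    model_weights[i] plays the role of P(i | D); weights are normalised here.
    """
    total = sum(model_weights.values())
    combined: Dict[str, float] = {}
    for model, probs in model_probs.items():
        weight = model_weights[model] / total
        for label, p in probs.items():
            combined[label] = combined.get(label, 0.0) + weight * p
    return combined


# Hypothetical outputs of two classifiers for a single post (illustrative only).
probs = {
    "local": {"positive": 0.7, "negative": 0.3},
    "relational": {"positive": 0.4, "negative": 0.6},
}
# Hypothetical stand-ins for P(i | D).
weights = {"local": 0.25, "relational": 0.75}

result = rbma_predict(probs, weights)
label = max(result, key=result.get)  # label with the highest averaged probability
```

In this toy case the relational classifier carries more weight, so its preference for the negative label prevails in the averaged estimate.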

In order to estimate \(P(l(s) | C, D)\), we adopted a set of relational classifiers, each of which consists of a local (non-relational) classifier and another classifier that takes relations into account, provided by the NetKit toolkit (Macskassy and Provost 2007).

In contrast to the relational learning paradigms, where the instances of the training and test sets are mutually exclusive, in social network environments a user instance can have messages belonging to both the training and the test set. This raises the problem of modelling new unlabelled data as they become available, constantly updating the underlying graph. The graph \(G={\{S, E\}}\) used by RBMA is composed of a set S of nodes, which denote the users interacting in the social network, and a set E of edges, which denote the approval relationships among them. Because relations and opinions mutate continuously, the graph changes dynamically. To better grasp the dynamics of the social interactions, consider the following example. If a user posts a comment on a social network (Facebook, Instagram, Twitter), the user becomes a node in graph G. After a few seconds, that comment can be approved by other users, generating complex relations that add new edges to the graph. Moreover, a user that is present with one post in the training set can create a new message to be classified as part of the test set. Given this dynamic behaviour of social networks, the graph must be properly represented in order to use the available labelled users and messages for training RBMA and to infer the polarity of new users and posts when they become available.

To accomplish this task, a first group of people is initially selected, together with their comments and relations, to create the training set on which RBMA is trained. When new posts become available, there are two cases: (1) if a user, with its contents, appears for the first time in the graph, it is simply added to the test set, creating connections if it also approves some existing comments that belong to the training set; (2) if a user already belongs to the training set but posts a new comment, it is added to the graph as a test user, inheriting its existing approval relationships.
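The two cases above can be sketched with a small Python data structure. This is only an illustrative reading of the procedure, not the authors' code: the class `ApprovalGraph`, its method names and the `_TR`/`_TE` naming of nodes are assumptions introduced here, and the approval by the new user in the usage example is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class ApprovalGraph:
    # Node name -> "TR" (training set) or "TE" (test set).
    nodes: Dict[str, str] = field(default_factory=dict)
    # Directed approval edges: (approver, approved).
    edges: Set[Tuple[str, str]] = field(default_factory=set)

    def add_training_user(self, user: str) -> str:
        name = f"{user}_TR"
        self.nodes[name] = "TR"
        return name

    def add_approval(self, approver: str, approved: str) -> None:
        self.edges.add((approver, approved))

    def add_new_post(self, user: str) -> str:
        """Add a test node for a user who has just posted a new comment."""
        tr, te = f"{user}_TR", f"{user}_TE"
        self.nodes[te] = "TE"
        if tr in self.nodes:
            # Case (2): the user is already in the training set, so the
            # test node inherits the existing approval relationships.
            for src, dst in list(self.edges):
                if src == tr:
                    self.edges.add((te, dst))
                if dst == tr:
                    self.edges.add((src, te))
        # Case (1): a brand-new user is simply added; any approvals of
        # existing comments are added afterwards with add_approval().
        return te


graph = ApprovalGraph()
for person in ("Steve", "Mary", "John"):
    graph.add_training_user(person)
graph.add_approval("Mary_TR", "John_TR")   # approval observed in the training data

mary_te = graph.add_new_post("Mary")       # case (2): known user, new comment
cloe_te = graph.add_new_post("Cloe")       # case (1): first appearance
graph.add_approval(cloe_te, "John_TR")     # hypothetical approval by the new user
```

Keeping training and test nodes for the same person separate is one simple way to let the test node reuse the relational evidence gathered during training while still receiving its own prediction.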

In this way, dependencies between the training set and the test set are created. In order to better explain the construction of the graph used by RBMA for training/testing and the importance of dynamic relationships, we introduce an example reported in Fig. 2. In the toy example denoted by the graph, Steve, Mary and John are users in the training set and existing approval relationships are denoted by the continuous arrows. These users are denoted by the suffix TR because they belong to the TRaining set.

Fig. 2 Graph toy example: the training set components are denoted by the TR suffix and continuous arrows, while the test set components are denoted by the TE suffix and dashed arrows

Mary, who had approved one comment by John when the training data were collected, subsequently writes a new comment and approves two posts by John. As for Mary, we know her initial sentiment because she belongs to the group of people in the training set, but she is also present in the test set with the suffix TE. Users other than Mary also belong to the test set, namely John and Cloe. For all the users in the test set, enclosed by a dashed line, we need to infer the sentiment of their new posts, also taking into account new and inherited relationships.

In particular, we want to allow users to change their sentiment over time while still exploiting the original sentiment evidence available in the training set. To this end, we add a dummy link from those users in the test set to the corresponding users in the training set (updating the overall number of approvals as weights). For users that belong only to the test set, only approvals are introduced. The polarity predictions related to the users and their contents are exploited by the SMARTCAL platform to create reliable and up-to-date sentiment indicators. In fact, by monitoring specific PEOIs mentioned by the tourists, a dashboard reporting the user perception of each destination can be created. A toy example with three monitored PEOIs is reported in Fig. 3.

Fig. 3 Sentiment dashboard of PEOIs

Such indicators can be exploited by decision makers to understand the emotional and experiential elements of tourist activities, in order to improve or maintain satisfaction with the managed PEOIs or to identify new business development possibilities. On the other hand, the sentiment shared by other users can be taken into account by tourists when selecting the PEOIs to be included in their travel plan.

3.2 Tourist tour planning module

In order to meet the expectations of the smart tourist, the Tourist Tour Planning module is based on an engine that offers optimised itineraries, according to the preferences and constraints expressed by the users. The generic problem of generating customized tourist trips has been defined as the Tourist Trip Design Problem (TTDP). This is a well-known NP-hard routing problem with many practical applications (logistics, tourism and defence). The objective is to maximise the total score of the tour, taking into account different constraints related to total travel time and cost. An optimal solution of the TTDP should select the best set of PEOIs for the user to visit and also plan a tour that can be travelled almost optimally. Different variants of the TTDP have been studied in the literature, where authors introduced parameters and constraints that differ from those of the generic problem presented here. For a deeper literature overview, the reader can refer to Gavalas et al. (2014), Gunawan et al. (2016) and Jriji et al. (2020). As said before, the TTDP is an NP-hard problem, so no polynomial-time algorithm is known for solving it to optimality. Indeed, existing online applications solve instances using efficient heuristic algorithms. The literature is full of approaches for simplified versions of the TTDP, which usually focus on a deeply studied class of problems named Orienteering Problems (OP). The aim of an OP is to define the optimal set of nodes to visit and their order, maximizing (or minimizing) a certain score function (Golden et al. 1987). The OP is also usually solved with heuristic approaches due to its high complexity (Gendreau et al. 1998).

In order to build an effective engine, we consider a set of PEOIs (divided into categories). Each PEOI is assumed to be characterised by a ranking that depends on the sentiment analysis of social network data, and also by a time window, an average visit time and a visit cost. We formulate a variant of the classical problem, namely an Orienteering Problem with Time Windows, with additional constraints on the maximum budget imposed by the user, the duration of the tour and the time spent on each category of PEOIs. In particular, the last constraint is included to generate tours with a good trade-off among all the categories available to the user (a tourist usually does not want to spend all the time visiting museums or churches, but would also like to enjoy natural resources such as parks). All the formulation details are described in Ciancio et al. (2018). To the best of our knowledge, OPs are approached with heuristic algorithms, given their complexity and the high computational time required to solve them to optimality. For the SMARTCAL engine, we introduce a heuristic algorithm that solves large real-life instances in an acceptable computational time while obtaining good-quality solutions. The scheme we use for building the heuristic belongs to a well-known class of procedures: the genetic algorithm (GA).
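To make the constraints listed above concrete, the following Python sketch checks whether a candidate tour is feasible and, if so, returns its total score. It is a simplified illustration and not the formulation of Ciancio et al. (2018): the names `Peoi` and `evaluate_tour`, the rule that a visit must end before the time window closes, the single per-category time limit and all numerical values are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Peoi:
    name: str
    category: str
    score: float       # ranking derived from sentiment analysis
    open_at: float     # time window start (hours of the day)
    close_at: float    # time window end (hours of the day)
    visit_time: float  # average visit duration (hours)
    cost: float        # visit cost (euros)


def evaluate_tour(tour: List[Peoi],
                  travel: Dict[Tuple[str, str], float],
                  start_time: float,
                  max_duration: float,
                  max_budget: float,
                  max_category_time: float,
                  depot: str = "hotel") -> Optional[float]:
    """Return the total score of a feasible tour, or None if infeasible."""
    clock, spent, score = start_time, 0.0, 0.0
    category_time: Dict[str, float] = {}
    previous = depot
    for p in tour:
        clock += travel[(previous, p.name)]
        clock = max(clock, p.open_at)          # wait if arriving before opening
        if clock + p.visit_time > p.close_at:  # assumed time-window rule
            return None
        clock += p.visit_time
        spent += p.cost
        category_time[p.category] = category_time.get(p.category, 0.0) + p.visit_time
        score += p.score
        previous = p.name
    clock += travel[(previous, depot)]         # return to the starting point
    if clock - start_time > max_duration or spent > max_budget:
        return None
    if any(t > max_category_time for t in category_time.values()):
        return None
    return score


# Hypothetical data, for illustration only.
museum = Peoi("museum", "Museums", 8.0, 9.0, 13.0, 2.0, 10.0)
park = Peoi("park", "Parks", 6.0, 8.0, 20.0, 1.5, 0.0)
travel_times = {("hotel", "museum"): 0.5, ("museum", "park"): 0.25,
                ("park", "hotel"): 0.5}

feasible = evaluate_tour([museum, park], travel_times, 9.0, 8.0, 40.0, 3.0)
too_expensive = evaluate_tour([museum, park], travel_times, 9.0, 8.0, 5.0, 3.0)
```

A function of this kind can serve as the evaluation step of a heuristic, which only needs to compare candidate tours rather than solve the problem exactly.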

The name genetic algorithm is inspired by Charles Darwin's theory of natural evolution, because its scheme reproduces the process of natural selection. The algorithm usually maps the feasible solutions of the problem onto a data structure called a chromosome and evaluates them by means of a function, called the fitness function, that rewards the best features. At each iteration, the best chromosomes are selected to generate a new set of feasible solutions. This is the first parallel with Darwin's theory, because only the fittest individuals survive to reproduce the next generation (Kinnear 1994). Furthermore, to evolve from one generation to the next, two basic operators, whose names also derive from genetics, are defined: crossover, which consists of dividing two chromosomes into parts and mixing these parts with each other, and mutation, which consists of randomly changing a node of the solution in the chromosome. The algorithm ends when a certain number of iterations or a time limit is reached. All the details related to the algorithm and the experimental part are described in Ciancio et al. (2018). As an illustrative example, we consider a test defined on a set of PEOIs located in a city in the South of Italy, Reggio Calabria. The points of attraction considered belong to 4 categories: Museums, Parks, Entertainment, Places. The main features of the PEOIs are reported in Fig. 4.
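The selection, crossover and mutation steps described above can be sketched in Python as follows. This is a generic, simplified genetic algorithm and not the one detailed in Ciancio et al. (2018): chromosomes are plain lists of PEOI names, the fitness function only checks a total visiting-time limit, and all names, parameter values and data are hypothetical.

```python
import random
from typing import Dict, List

Chromosome = List[str]


def fitness(ch: Chromosome, score: Dict[str, float],
            visit_time: Dict[str, float], max_time: float) -> float:
    """Total score of the tour, or 0 if it exceeds the time limit."""
    if sum(visit_time[p] for p in ch) > max_time:
        return 0.0
    return sum(score[p] for p in ch)


def crossover(a: Chromosome, b: Chromosome, rng: random.Random) -> Chromosome:
    """Join the head of one parent with the tail of the other, skipping duplicates."""
    head = a[:rng.randint(1, len(a))]
    tail = b[rng.randint(0, len(b)):]
    return head + [p for p in tail if p not in head]


def mutate(ch: Chromosome, peois: List[str], rng: random.Random) -> Chromosome:
    """Replace one randomly chosen node with a PEOI not yet in the tour."""
    unused = [p for p in peois if p not in ch]
    if not unused:
        return ch
    out = list(ch)
    out[rng.randrange(len(out))] = rng.choice(unused)
    return out


def genetic_algorithm(peois: List[str], score: Dict[str, float],
                      visit_time: Dict[str, float], max_time: float,
                      pop_size: int = 20, generations: int = 50,
                      mutation_rate: float = 0.2, seed: int = 0) -> Chromosome:
    rng = random.Random(seed)

    def fit(ch: Chromosome) -> float:
        return fitness(ch, score, visit_time, max_time)

    # Random initial population of non-empty tours without repeated PEOIs.
    population = [rng.sample(peois, rng.randint(1, len(peois)))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the fittest half survives and reproduces.
        population.sort(key=fit, reverse=True)
        parents = population[:pop_size // 2]
        children: List[Chromosome] = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = crossover(a, b, rng)
            if rng.random() < mutation_rate:
                child = mutate(child, peois, rng)
            children.append(child)
        population = parents + children
    return max(population, key=fit)


# Hypothetical PEOIs, scores and visit times (hours), for illustration only.
names = ["museum", "park", "theatre", "castle", "promenade"]
scores = {"museum": 8.0, "park": 6.0, "theatre": 5.0, "castle": 9.0, "promenade": 4.0}
times = {"museum": 2.0, "park": 1.5, "theatre": 2.5, "castle": 2.0, "promenade": 1.0}

best = genetic_algorithm(names, scores, times, max_time=5.0)
```

Because the surviving parents are carried over unchanged, the best fitness in the population never decreases from one generation to the next; stopping after a fixed number of generations corresponds to the iteration limit mentioned above.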

Fig. 4 Data input for the genetic algorithm

We assume that the tourist has indicated the maximum duration of the daily itinerary (8 h), the maximum budget (40 €), the maximum admitted length (120 km) and the departure/arrival point, represented by the E-Hotel located in the city center. The connections representing the arcs of the graph used in the mathematical formulation are assumed to be known, and the corresponding costs and times have been previously determined. The tour obtained by the algorithm is shown in Fig. 5, which also reports the time required to move from one point to the next.

Fig. 5 Genetic algorithm output

4 Conclusions and future work

In this work, we have presented a Decision Support System for the tourism sector, developed within an R&D project (SMARTCAL), which aims at providing a useful tool for both users and decision makers in this field. On the one hand, its sentiment analysis module, based on the RBMA approach, can provide useful managerial insights for the planning and management of the PEOIs and of the supporting infrastructures and services. On the other hand, it acts as a Tourist Recommender System for the definition of optimal tourist routes that fit the user's implicit and explicit preferences, by combining a social network user profiling module with an optimization engine based on an efficient and effective heuristic algorithm.

The R&D project is still ongoing and, thanks to continuous interaction with the stakeholders of the sector, some functionalities will be strengthened, for example the link with new smart mobility systems. In conclusion, the proposed system offers great potential for tourist engagement, a high level of scalability and a good opportunity for business development. Moreover, it can also be seen as a useful managerial tool for decision makers in this field, since it gives them a real-life perception of end users' perspectives and preferences.