1 Introduction

The “second railway age” has been brought about by significant advancements in transportation technology and the ongoing construction of High-Speed Rail (HSR) [1, 2]. China's HSR network has expanded significantly during the past ten years, affecting the geographical organization of cities within the transportation system.

China has, in fact, the longest HSR network in the world (see Fig. 1). The total operating mileage of the national railway in 2021 exceeded 150,000 km, including 40,000 km of HSR. Additionally, 2168 km of new mileage was built and put into service in 2021. The HSR network's coverage of cities with a population of more than 500,000 people increased from 28% in 2012 to 93% in 2021. On the mainland, all provincial capital cities except Lhasa have been connected to the HSR network. In August 2020, China National Railway Group Co., Ltd. released the Outline of Railway Development Plan for A Powerful Transportation Country, which defines the development blueprint of China Railway for the next 15 and 30 years [3]. According to the plan, by 2035 China will have built about 200,000 km of national railway network, including about 70,000 km of HSR. HSR mileage will be double that of 2019, and HSR will serve all provincial cities and cities with a population of more than 500,000. A 3-h HSR circle will be basically realized between the provincial capitals of adjacent regions (http://www.xinhuanet.com/fortune/2020-08/14/c_1126366741.htm).

Fig. 1
A map of China presents the longest high-speed rail network. All the capital cities except Lhasa have the H S R connection. It highlights all high-speed rail lines and railways.

The High-Speed Rail network in China

This study investigates the factors influencing tourists’ decisions, and in particular the influence of HSR, using thirty Chinese provinces as a case study. Several contributions in the literature focus primarily on the accessibility and mobility impacts of HSR in China [4,5,6], as well as on the influence of HSR on the growth of regional tourism [7] and on foreign visitors [8]. Although extensive studies have examined the impact of HSR on tourism in China, the findings are not always consistent: significant effects were found in some instances [9, 10] and insignificant effects in others [11]. For example, Chen and Haynes [8] demonstrated that the Chinese provinces served by HSR experienced an increase in the number of foreign tourists of 20% and an increase in tourism revenue of almost 25%. In addition, Chen et al. [12] evaluated the spatial impacts of HSR on domestic tourism demand in China using spatial econometric analysis for the period 1999–2016. Their study confirmed that HSR has diverse spatial impacts on tourism output, with a particularly strong effect in the less developed western regions, a moderate impact in the central region, and a less significant impact in the developed eastern regions.

The effect of HSR on tourism in the Yangtze River Delta was analyzed by Taotao et al. [13]. The Yangtze River Delta's development in regional tourism demonstrated a “HSR effect,” and the demand and supply of tourism-related goods significantly improved. Yuhua and Jun [14] demonstrated how the HSR's introduction impacted the growth of tourism in the cities it served.

After studying Huangshan City's tourist spatial structure before, immediately after, and two years after the HSR's inauguration, Lei et al. [15] concluded that the HSR's inauguration had little to no effect on Huangshan City's tourism.

An additional investigation revealed that the expansion of the HSR network also affected tourist flows and the spatial relationship between the two cities of Beijing and Tianjin [16, 17].

Ziyang et al. [18] used Xiamen City as a case study and confirmed a relationship between HSR and tourism. According to Yongze et al. [19], the inauguration of the HSR had a substantial impact on encouraging regional tourism. However, moving from the east of the country to the west, this influence gradually diminished.

The impact of high-speed rail on tourism development can also be predicted through random utility models (RUM). A study on their predictive capability in terms of market share was conducted in [20]. Travellers’ and transport users’ preferences can be inferred from different data sources, such as trip diaries or trajectories (e.g. [21]).

In this chapter, a spatio-temporal analysis has been adopted to evaluate the variables affecting tourists’ choices and, specifically, the impact of HSR on both Chinese domestic and foreign tourists.

Two methods were adopted. First, the Geographically Weighted Regression with Poisson distribution (GWPR) modelling approach was chosen; it handles temporal and spatial autocorrelation differently from the Generalized Estimating Equations method. Specifically, the results of this study support the use of GWPR as a useful tool for tourism planning, since it makes it possible to model spatially non-stationary count data. As far as the authors know, this methodology has never been applied to this context in the international literature.

Second, a further analysis combines temporal and spatial autocorrelation by applying Geographical and Temporal Weighted Regression (GTWR) models, in order to also take into account local effects from the temporal point of view.

The chapter is organized as follows. Section 2 deals with the description of the data set and the methodology. In Sect. 3, the results are reported. Conclusions and further perspectives are reported in Sect. 4.

2 Description of the Data Set and the Methodology

The dataset collected for this study covers thirty of China's thirty-four provincial-level regions; Hong Kong, Macao, Taiwan and Tibet are excluded due to data limitations. In total, the data cover the period from 2001 to 2019 (see Fig. 2).

Fig. 2
A map of China presents the data set. The data is collected for all provinces, municipalities, autonomous regions and not analyzed for Xizang and Taiwan.

Provinces and regions under study

As shown in Table 1, eight variables were adopted in this evaluation.

Table 1 Variables

The impacts of HSR projects on tourism can be quantified in different ways.

In this study, the dependent variables take only non-negative integer values, so their statistical treatment differs from that of normally distributed variables, which can assume any real value, positive or negative, integer or fractional. Count data can be modeled using different methods; the most popular is the Poisson distribution, which is applied in a wide range of transportation count data contexts. In a Poisson regression model, the probability of city i having \({y}_{i}\) tourists per year is given by [22,23,24]:

$$P(y_{i} )\, = \,\frac{{\lambda_{i}^{{(y_{i} )}} \times e^{{ - \lambda_{i} }} }}{{y_{i} !}}$$
(1)

where \(P\left({y}_{i}\right)\) is the probability of city i having \({y}_{i}\) tourists per year and \({\lambda }_{i}\) is the Poisson parameter for city i, which is equal to the expected number of tourists per year at city i, E[\({y}_{i}\)]. The mean and the variance are given by E[\({y}_{i}\)] = \({\lambda }_{i}\) and V[\({y}_{i}\)] = \({\lambda }_{i}\). Generalized Linear Models (GLMs) are considered the most suitable to determine the relationship between count data and the explanatory variables. GLMs extend ordinary regression models to non-normal response distributions [25, 26]. Furthermore, the data considered involve repeated measurements over time for the same cities; if ignored, the resulting serial correlation can seriously affect the estimated parameters and lead to inappropriate statistical inferences. Therefore, panel data regression models have been considered. Panel model analysis provides a general, flexible approach in these contexts, since it allows various correlation patterns to be modeled. To account for these possibly unknown correlations, an extension of GLMs, namely Generalized Estimating Equations (GEEs), has been considered. The relationship between the explanatory variables and the Poisson parameter is given by [27]:

$$E[y_{it}] = e^{\left( \beta_{0} + \beta_{1} x_{1t} + \beta_{2} x_{2t} + \cdots + \beta_{p} x_{pt} + \phi y_{it-1} + u_{it} \right)}$$
(2)

where \({\beta }_{0}\) is the intercept, \({\beta }_{1},\dots ,{\beta }_{p}\) are the regression coefficients, ϕ is the parameter of the autoregressive component and \({u}_{it}\) is the error component for the disturbances. The main problem is that \({u}_{it}\) is correlated with \({y}_{it-1}\). To address this, the model is fitted by using population-averaged Poisson models and by imposing an AR(1) process on the error term. These models are suitable when the random effects and their variances are not of inherent interest, as they allow for the correlation without explaining its origin. The aim is to estimate the average response over the population rather than the regression parameters that would enable the prediction of the effect of changing one or more components of the predictor variable on a given individual.
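As an illustration of Eqs. (1) and (2), the following minimal Python sketch evaluates the Poisson probability and the log-linear conditional mean, and checks numerically that mean and variance coincide. The function names and all coefficient and covariate values are hypothetical and chosen only for illustration; the disturbance term \({u}_{it}\) is omitted, and this is not the estimation procedure used in the study.

```python
import math

def poisson_pmf(y, lam):
    """Eq. (1): probability of observing count y when the expected count is lam."""
    return lam ** y * math.exp(-lam) / math.factorial(y)

def expected_count(beta0, betas, x, phi, y_prev):
    """Eq. (2) without the disturbance term: exp of the linear predictor."""
    linear = beta0 + sum(b * xk for b, xk in zip(betas, x)) + phi * y_prev
    return math.exp(linear)

# Hypothetical coefficients and covariates (not taken from the chapter's tables).
lam = expected_count(beta0=0.5, betas=[0.3, 0.1], x=[1.0, 2.0], phi=0.05, y_prev=4.0)

# Probabilities for counts 0..59; the tail beyond 59 is negligible for this lam.
probs = [poisson_pmf(y, lam) for y in range(60)]
mean = sum(y * p for y, p in enumerate(probs))
var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
```

In this sketch `mean` and `var` both agree with `lam` up to numerical error, which is the equidispersion property stated for the Poisson distribution.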

Variables are selected by a backward elimination procedure, which starts with all the variables in the model. At each step of the backward process, the variable with the largest p-value is removed. The process ends when all the variables in the model have a p-value less than 0.05 or when no variable remains [25].

The significance of each variable has been tested with Student's t statistic; a coefficient is considered significant when the absolute value of t is greater than 1.96.
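The backward elimination loop can be sketched as follows. This is a minimal illustration under stated assumptions: `fit_pvalues` stands for a hypothetical, caller-supplied function that refits the model on the current variable set and returns one p-value per variable, and the toy p-values below are invented (the labels `X4` and `X5` are placeholders, not variables from Table 1).

```python
def is_significant(t_stat, critical=1.96):
    """Two-sided check at the 5% level: |t| must exceed the critical value."""
    return abs(t_stat) > critical

def backward_elimination(variables, fit_pvalues, alpha=0.05):
    """Drop the least significant variable until all p-values are below alpha.

    fit_pvalues is a hypothetical callable: it takes the list of currently
    selected variables and returns a dict {variable: p-value}.
    """
    selected = list(variables)
    while selected:
        pvalues = fit_pvalues(selected)
        worst = max(selected, key=lambda v: pvalues[v])
        if pvalues[worst] < alpha:
            break  # every remaining variable is significant
        selected.remove(worst)
    return selected

# Toy stand-in for a model fit: fixed, invented p-values.
TOY_PVALUES = {"HSR": 0.001, "GDP": 0.020, "IntAirport": 0.030, "X4": 0.400, "X5": 0.700}

def toy_fit(selected):
    return {v: TOY_PVALUES[v] for v in selected}

kept = backward_elimination(list(TOY_PVALUES), toy_fit)
```

In a real application the p-values change at every refit, which is why the fit is repeated inside the loop instead of being computed once.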

Then, the Geographically Weighted Generalised Linear (GWGL) model was developed by extending the concept of Geographically Weighted Regression (GWR) models to the context of Generalized Linear Models (GLMs). Given that the dependent variables are count data with discrete, non-negative integer values, the GWR models have been estimated using the Poisson error distribution [28].

The Geographically Weighted Poisson Regression (GWPR) approach has been adopted [29] to capture the heterogeneity of the independent variables concerning each province. These models capture the spatial variation by fitting a regression model at each sample point. The result of this process is a set of local spatial parameters, described in Eq. (3).

$$E[y_{i} ] = e^{{\left( {\sum_{p} \beta_{jp} \left( {u_{j} ;v_{j} } \right)x_{jp} } \right)}}$$
(3)

where \(\left({u}_{j};{v}_{j}\right)\) are the coordinates of the different areas, \({\beta }_{jp}\) represents the regression coefficient for the independent variable \(p\) and \({x}_{jp}\) is the independent variable with \(p=1,\dots ,P\). The basic idea of GWR is that the data observed near point \(i\) have a higher influence on the estimation of the \({\beta }_{j}({u}_{i})\)’s than the data located further away. A weighting function describes this influence. GWR captures spatial variation by fitting a regression model at each point individually, using a distance-based function known as a spatial kernel function. Models have been estimated yearly to capture the variability over time and space.
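The role of the spatial kernel can be illustrated with a short Python sketch. The chapter does not state which kernel or bandwidth was used, so the Gaussian kernel, the planar Euclidean distance, the bandwidth and all coordinates and counts below are assumptions made for illustration. To keep the example small, the local model has an intercept only, in which case the weighted Poisson maximum-likelihood estimate of the local mean is the weighted average of the counts.

```python
import math

def gaussian_kernel_weights(target, points, bandwidth):
    """Weight of each observation for the local fit at `target`.

    Assumed Gaussian kernel: w = exp(-0.5 * (d / bandwidth)^2), where d is
    the planar Euclidean distance between the observation and the target.
    """
    weights = []
    for u, v in points:
        d = math.hypot(u - target[0], v - target[1])
        weights.append(math.exp(-0.5 * (d / bandwidth) ** 2))
    return weights

def local_poisson_mean(weights, counts):
    """Weighted Poisson MLE for an intercept-only local model: sum(w*y) / sum(w)."""
    return sum(w * y for w, y in zip(weights, counts)) / sum(weights)

# Hypothetical coordinates and tourist counts for four areas.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
counts = [10, 12, 20, 100]

w = gaussian_kernel_weights(points[0], points, bandwidth=1.0)
local_mean = local_poisson_mean(w, counts)
```

Here the distant area with 100 tourists receives a weight close to zero, so the local estimate at the first area stays close to the counts of its nearest neighbours. A full GWPR fit would replace the intercept-only model with a weighted Poisson regression on all covariates at each location.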

Lastly, an extension of geographically weighted regression (GWR), geographical and temporal weighted regression (GTWR), is adopted to account for local effects in both space and time [30]. The result of this process is a set of local spatio-temporal parameters, described in Eq. (4).

$$E[y_{i} ] = e^{{\left( {\sum_{p} \beta_{jp} \left( {u_{j} ;v_{j} ;t} \right)x_{jp} } \right)}}$$
(4)

where, similarly to the GWPR model, \(\left({u}_{j};{v}_{j}\right)\) are the coordinates of the different areas, with the addition of the term t to indicate the dependence on the time dimension, \({\beta }_{jp}\) represents the regression coefficient for the independent variable \(p\) and \({x}_{jp}\) is the variable value.
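One common way to extend the spatial kernel to space and time is to combine the squared spatial and temporal distances into a single spatio-temporal distance before applying the kernel. The sketch below assumes this form; the scale factors `space_scale` and `time_scale`, the Gaussian kernel, the bandwidth and the sample points are all hypothetical and are not taken from the chapter.

```python
import math

def gtwr_weight(target, obs, bandwidth, space_scale=1.0, time_scale=1.0):
    """Spatio-temporal kernel weight between two (u, v, t) points.

    Assumed squared distance: space_scale * (du^2 + dv^2) + time_scale * dt^2,
    passed through a Gaussian kernel with the given bandwidth.
    """
    du = obs[0] - target[0]
    dv = obs[1] - target[1]
    dt = obs[2] - target[2]
    d2 = space_scale * (du * du + dv * dv) + time_scale * dt * dt
    return math.exp(-0.5 * d2 / bandwidth ** 2)

# Hypothetical observations: (u, v, year).
target = (0.0, 0.0, 2019)
same_place_2015 = (0.0, 0.0, 2015)
nearby_2019 = (1.0, 1.0, 2019)

w_time = gtwr_weight(target, same_place_2015, bandwidth=3.0)
w_space = gtwr_weight(target, nearby_2019, bandwidth=3.0)
```

Setting `time_scale=0.0` removes the temporal term, so the weight depends on spatial distance only, as in the purely spatial GWPR weighting.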

3 Results

The estimation results of the GLM are reported in Tables 2 and 3. Independent variables that were not statistically significant at the 0.05 level were removed from the models.

Table 2 GLM model: domestic tourist
Table 3 GLM model: overseas tourist

The GLM results show that the numbers of both domestic and foreign tourists are influenced by the presence of HSR stations, the presence of IntAirport, GDP, and the Number of Passengers.

The results of the GEE models, reported in Tables 4 and 5, confirm the results obtained by the GLM models. However, GEE models are more conservative than GLMs. A higher standard error of the GLM can be observed; moreover, the value of the log-likelihood is also lower, indicating a greater ability to estimate the models.

Table 4 GEE model: domestic tourist
Table 5 GEE model: overseas tourist

Starting from the statistically significant variables obtained from the GLM and GEE models, the GWPR models have been estimated for every year. An example of the results obtained for the years 2001, 2010 and 2019 is shown in Tables 6 and 7 for domestic and foreign tourists, respectively, where the minimum, the first quartile, the median, the third quartile, the maximum and the global values are reported.

Table 6 GWPR model: domestic tourist
Table 7 GWPR model: overseas tourist
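The summary statistics reported for the local coefficients (minimum, quartiles, maximum) can be computed as in the following minimal sketch. The coefficient values are invented for illustration, and the quartile convention is the default of Python's `statistics.quantiles`, which may differ from the one used to build Tables 6 and 7.

```python
import statistics

def summarize_local_coefficients(coefs):
    """Five-number summary of a set of local regression coefficients."""
    q1, median, q3 = statistics.quantiles(coefs, n=4)
    return {"min": min(coefs), "q1": q1, "median": median, "q3": q3, "max": max(coefs)}

# Invented local HSR coefficients for five provinces in one year.
local_hsr = [0.1, 0.5, 0.2, 0.4, 0.3]
summary = summarize_local_coefficients(local_hsr)
```

A wide gap between the minimum and the maximum in such a summary is what signals spatial non-stationarity of a coefficient.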

In particular, for each year, it is possible to observe a variability of the coefficients, indicating that the impact of a given variable is not the same across provinces or over time. Such variability generally reflects a diversified mixture of spatial events, which relates to the intensity of a spatial phenomenon.

Looking at the global value of the HSR coefficient, it is interesting to notice, in general, an increase from 2001 to 2019 for both domestic and foreign tourists. For the provinces of Beijing, Hainan, Hebei, Heilongjiang, Hubei, Ningxia, Qinghai, and Shandong, the HSR coefficient increases starting from 2013 (see Figs. 3 and 4).

Fig. 3
A line graph plots domestic H S R stations versus years from 2001 to 2019. The curves for Beijing, Hainan, Hebei, Heilongjiang, Hubei, Ningxia, Qinghai, and Shandong start increasing in 2009 and have a sharp increase from 2013.

The impact of HSR on eight provinces—GWPR: domestic tourists

Fig. 4
A line graph plots overseas H S R stations from 2001 to 2019. The curves for Beijing, Hainan, Hebei, Heilongjiang, Hubei, Ningxia, Qinghai, and Shandong decrease from 2009, slightly increases from 2012, have a sharp increase in 2016, and finally decline.

The impact of HSR on eight provinces—GWPR: overseas tourists

Results are presented for three years, i.e. 2013, 2015 and 2019, for the Chinese tourists. The results for the years before 2013 are not reported, since the coefficients are not significant.

Figs. 5 and 6 report the weight that the coefficient of the variable HSR has in each province; the objective is to show the effect of HSR in the neighboring provinces on their tourism.

Fig. 5
3 maps of China for domestic tourists have a color code for H S R station values to the right. In 2013 and 2015, the provinces of China have very low and moderate coefficients of H S R, respectively. In 2019, the provinces have high coefficients with very high values for Hunan, Jiangxi, and Hubei.

The coefficients of the variable HSR for each province—GWPR: domestic tourists

Fig. 6
3 maps of China for overseas tourists have a color code for H S R station values to the right. In 2013 and 2015, the provinces of China have low coefficients of H S R. In 2019, all the provinces have high coefficients.

The coefficients of the variable HSR for each province—GWPR: overseas tourists

It appears that central provinces, such as Hubei, Chongqing, Jiangxi and Hunan, have experienced more substantial impacts from HSR on domestic tourism, while the HSR's impact on international tourism is relatively evenly distributed among all the provinces.

The results of the GTWR essentially confirm the results of the GWPR models (Tables 8 and 9).

Table 8 GTWR model: domestic tourist
Table 9 GTWR model: overseas tourist

However, since the GTWR models are more conservative than the GWPR models, a lower spatial and temporal variability is observed (Figs. 7, 8, 9 and 10).

Fig. 7
A line graph plots domestic H S R stations versus years from 2001 to 2019. The increasing lines for Beijing, Hainan, Hebei, Heilongjiang, Hubei, Ningxia, Qinghai, and Shandong start between 0.8 and 1 on the y-axis. The highest estimated value of 1.2 is for Qinghai in 2018.

The impact of HSR on eight provinces—GTWR: domestic tourists

Fig. 8
A line graph plots overseas H S R stations versus years from 2001 to 2019. The lines for Beijing, Hainan, Hebei, Heilongjiang, Hubei, Ningxia, Qinghai, and Shandong start between 0.2 and 0.4 on the y-axis. The highest estimated value of 0.5 is for Ningxia in 2019.

The impact of HSR on eight provinces—GTWR: overseas tourists

Fig. 9
3 maps of China for domestic tourists have a color code for H S R station values to the right. In 2013, the provinces of China have low and moderate coefficients of H S R. In 2015 and 2019, the provinces majorly have moderate coefficients.

The coefficients of the variable HSR for each province—GTWR: domestic tourists

Fig. 10
3 maps of China for overseas tourists have a color code for H S R station values to the right. In 2013, 2015, and 2019 all the provinces of China have low coefficients of H S R.

The coefficients of the variable HSR for each province—GTWR: overseas tourists

4 Conclusion

In this chapter, we provided a spatio-temporal analysis using two advanced methods, the Geographically Weighted Regression with Poisson distribution (GWPR) and the Geographical and Temporal Weighted Regression (GTWR) model, to evaluate the spatial impact of HSR on tourist behavior. Using the Chinese HSR system as an example, our study confirms that the impact of HSR varies both spatially and temporally. The results suggest that the impacts of HSR on tourism in China have increased constantly since 2013. In terms of the spatial impacts, central provinces, such as Hubei, Chongqing, Jiangxi and Hunan, have experienced more substantial impacts from HSR on domestic tourism, while the HSR's impact on international tourism is relatively evenly distributed among all the provinces.

Overall, the study reveals some patterns consistent with those of Chen et al. [12], who adopted spatial econometric models. For instance, the impact of HSR on domestic tourism is found to be relatively strong in central provinces. Such a result suggests that the HSR system tends to enhance the attractiveness of central regions and promote tourism due to improved regional accessibility and connectivity.

Future transport infrastructure project evaluation should consider adopting more advanced spatial modeling techniques, such as spatially weighted regression models and spatial econometric models, to capture both the spatial–temporal variation of impacts and the spatial dependence of impacts. Only a full understanding of the spatial and temporal impacts of the system can provide sound implications to guide future planning and policy decision-making.