1 Introduction

The aim of this paper is to combine the systemic complexity of a spatial interaction model (SIM) with the hedonic price method to evaluate complex urban scenarios. The results from SIMs are mainly expressed in quantitative terms, and the bid-rents are often seen as model parameters that are not related to the actual value of the spatial constraints. There are various methods to model spatial variations in housing prices: location dummies, distance gradients, accessibility and interactive variables approaches (Fik et al. 2003). Nevertheless, to our knowledge, none of them considers the capacity of the calibrated bid-rents of SIMs as estimators of housing prices. The advantage of combining partial equilibrium hedonic price models with a general equilibrium SIM is that it establishes consistent links between, on the one hand, spatial indicators of activities and flows and, on the other hand, real estate values at the urban scale, providing reliable multidimensional results for the multi-criteria analysis of decision makers (Nijkamp et al. 2003).

SIMs are built to describe and predict the flow of people, goods and information across space (Sen and Smith 1995). Spatial interaction analyses have long been reported in the literature (Carey 1858; Reilley 1932; Stewart 1948; Carrothers 1956; Schneider 1959). These studies have contributed to the development of analytical tools that are commonly used in land planning, geography and regional science (Wilson 1967; Isard 1975; Batty 1976; Anderson 1979; Haynes and Fotheringham 1984; Fotheringham and O’Kelly 1989; NijkaMillonen and Luoma 1999), transport (Hyman 1969; Evans 1971, 1976; Erlander and Stewart 1990), commerce and marketing (Huff 1964; Bergstrand 1985; Deardorff 1998), and demography and migration. Concerning land use, there are many disciplinary approaches, and Silveira and Dentinho (2010) surveyed the different types of explanatory models of land use. An extensive review of these theoretical bases, with a broader scope covering various economic fields, is provided by Roy (2004). Moreover, further improvements in the field of SIMs are expected owing to the rapid development of computer hardware, which allows a broader and faster integration of societal and ecological complexities (Irwin and Geoghegan 2001; Wilson 2010; Echenique et al. 2011).

The hedonic price method (HPM) is one of the revealed preferences methods (Pearce et al. 2006) and estimates changes in the value of a non-market good—for example the change in value associated with a particular policy—by observing the behavior in the market for a related good (Bertrand and Fransoo 2002; Arraes and Filho 2008).

In the present paper, the bid-rents of the SIM are related to house prices through an HPM; therefore, the quantitative estimates of the SIM become consistent with the value estimates of the HPM. Through the development of an appropriate calibration process for the bid-rents of the SIM, it becomes possible to estimate changes in property values associated with specific changes in the urban system, such as employment, accessibility, housing and land use policies.

The analysis is applied to Terceira Island in the Azores Archipelago, which has an area of \(400\,\hbox {km}^{2}\) and 56,437 residents, with an economy based on dairy products and beef exports, unilateral public transfers, tourism, fishery and the Lajes Military Airfield (Haddad et al. 2015). The issue is to assess the impact of the current Lajes Field drawdown, a real scenario that is currently unfolding in Terceira, in order to demonstrate the usefulness and effectiveness of the combined SIM/HPM methodology.

The paper is structured as follows: Sect. 2 presents the methods, Sect. 3 presents and discusses the results, and Sect. 4 draws some conclusions.

2 Methods

2.1 SIM formulation

The formulated SIM is a gravity-based Lowry-type model which generates the distributions of urban population and service employment across the zones of a region according to their average residence-employment and population-services distances. This model was later transformed into a nonlinear mathematical problem in which the dual variables on the constraints can be interpreted as rents (Wilson 2010). The model considers that exports (basic activities, \(E_\mathrm{b}\)) are the propulsive factors of the economy that depend on external influences; non-basic employment (\(E_\mathrm{nb}\)) refers to employment that directly serves the local population. The sum of \(E_\mathrm{b}\) and \(E_\mathrm{nb}\) gives the total employment (E).

The model assumes that the spatial interaction \(T_{ij}\) between one origin i and one destination j from a set of m zones is positively related to the attraction \(V_{j}/W_{j}\) of destination j and negatively related to the distance between them, \(d_{ij}\). A higher value of \(W_{j}\) in a given zone means that the attraction \(V_{j}/W_{j}\) must be reduced to sustain a higher demand in that zone; \(W_{j}\) ultimately reflects higher real estate values and is related to the value of the bid-rent. Notice that V is introduced to provide scale to zones of different sizes.

Considering a specific zone i and its interaction with zone j from a set of m zones, the interaction \(T_{ij}\) between i and j is given by:

$$\begin{aligned} T_{ij} =E_{i} \frac{V_{j} /W_{j} .\mathrm{e}^{-\alpha {d}_{ij} }}{\mathop {\sum }\nolimits _{j=1}^m V_{j} /W_{j} .\mathrm{e}^{-\alpha {d}_{ij} }} \end{aligned}$$
(1)

for all pairs of zones i and j. The population \(P_j\) of zone j is then:

$$\begin{aligned} P_{j} =\mathop {\sum }\limits _{i=1}^{m} r.T_{ij} \end{aligned}$$
(2)

where \(T_{ij}\) is the number of commuters who work in zone i and live in zone j, \(E_{i}\) is the employment of zone i, r is the inverse of the activity ratio (total population/total employment of the region), \(V_{j}/W_{j}\) is the attraction of zone j, \(\alpha \) is the parameter that defines the friction produced by distance for commuters, \(d_{ij}\) is the distance between zones i and j, and \(P_{j}\) is the population of zone j.
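As an illustration of Eqs. (1) and (2), the following minimal Python sketch distributes the employment of each workplace zone over residence zones and then aggregates the population. It is not the MATLAB tool used in this paper; the function names and the three-zone data are hypothetical and serve only to show the mechanics.

```python
import math

def commuting_flows(employment, attraction, dist, alpha):
    """Eq. (1): T[i][j] = E_i times the share of residence zone j."""
    m = len(employment)
    flows = []
    for i in range(m):
        # Attraction of each residence zone j, discounted by distance from i.
        weights = [attraction[j] * math.exp(-alpha * dist[i][j]) for j in range(m)]
        total = sum(weights)
        flows.append([employment[i] * w / total for w in weights])
    return flows

def population(flows, r):
    """Eq. (2): P_j = sum over workplace zones i of r * T[i][j]."""
    m = len(flows)
    return [r * sum(flows[i][j] for i in range(m)) for j in range(m)]

# Hypothetical three-zone data (not taken from the case study).
E = [100.0, 50.0, 20.0]      # employment by workplace zone
VW = [1.0, 2.0, 0.5]         # attraction V_j / W_j by zone
d = [[1.0, 5.0, 9.0],
     [5.0, 1.0, 4.0],
     [9.0, 4.0, 1.0]]        # distances in minutes
T = commuting_flows(E, VW, d, alpha=0.2)
P = population(T, r=2.0)
```

Because the shares in Eq. (1) sum to one over j, each row of T adds up to the employment of its workplace zone, and the total population equals r times the total employment.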

The activities \(S_{ji}\) generated in each zone i to serve the population \(P_j\) living in the zones within a service range are given by:

$$\begin{aligned} S_{ji} =P_{j} \frac{V_{i} /W_{i} .\mathrm{e}^{-\beta {d}_{ij} }}{\mathop {\sum } \nolimits _{i=1}^{m} V_{i} /W_{i} .\mathrm{e}^{-\beta {d}_{ij} }} \end{aligned}$$
(3)

for all zones i. The employment \(E_i\) of zone i is then:

$$\begin{aligned} E_{i} =\mathop {\sum }\limits _{j=1}^{m} s.S_{ji} \end{aligned}$$
(4)

where \(S_{ji}\) is the activity generated in zone i that serves the population in zone j, \(P_{j}\) is the number of residents in zone j, s is the non-basic activity ratio \((E_{\mathrm{nb}}/\hbox {Population})\), \(V_{i}/W_{i}\) is the attraction of zone i, \(\beta \) is the parameter that defines the friction produced by distance for people seeking services, \(d_{ij}\) is the distance between zones i and j, and \(E_{i}\) is the employment of zone i. The elements of the matrices [A] and [B] are then defined as:

$$\begin{aligned} \left[ {A_{ij} } \right]= & {} \frac{r.V_{j} /W_{j} .\mathrm{e}^{-\alpha {d}_{ij} }}{\mathop {\sum }\nolimits _{j=1}^{m} r.V_{j} /W_{j} .\mathrm{e}^{-\alpha {d}_{ij} }} \end{aligned}$$
(5)
$$\begin{aligned} \left[ {B_{ji} } \right]= & {} \frac{s.V_{i} /W_{i} .\mathrm{e}^{-\beta {d}_{ij} }}{\mathop {\sum }\nolimits _{i=1}^{m} s.V_{i} /W_{i} .\mathrm{e}^{-\beta {d}_{ij} }} \end{aligned}$$
(6)

The endogenous variables (\(P_{j}\) and \(E_{i}\)) can be obtained from the exogenous variable \(E_{\mathrm{b}}\) through the use of matrices [A], [B] and the identity matrix \(I_{\mathrm{M}}\):

$$\begin{aligned} \left[ E \right]= & {} {\left\{ {I_\mathrm{M} -\left[ B \right] \left[ A \right] } \right\} }^{-1}.\left[ {E_\mathrm{b}} \right] \end{aligned}$$
(7)
$$\begin{aligned} \left[ P \right]= & {} {\left\{ {I_\mathrm{M} -\left[ B \right] \left[ A \right] } \right\} }^{-1}.\left[ {E_\mathrm{b} } \right] \left[ A \right] \end{aligned}$$
(8)
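The closed-form solution in Eqs. (7) and (8) can also be approximated by repeated substitution, which may help to see how basic employment propagates through the system. The sketch below is written under stated assumptions: total employment is basic plus induced non-basic employment, the distance matrix is symmetric, and the product of r and s is below one. The function names and the three-zone data are hypothetical; r and s are the values reported for the case study.

```python
import math

def share_matrix(attraction, dist, friction):
    """Row-normalised shares: entry [o][t] is the share of target zone t seen from origin o."""
    out = []
    for row in dist:
        w = [a * math.exp(-friction * c) for a, c in zip(attraction, row)]
        total = sum(w)
        out.append([x / total for x in w])
    return out

def lowry_fixed_point(basic, attraction, dist, alpha, beta, r, s,
                      tol=1e-10, max_iter=1000):
    """Iterate E = E_b + s * (services induced by P), with P = r * (residents induced by E)."""
    m = len(basic)
    res = share_matrix(attraction, dist, alpha)  # workplace i -> residence j
    srv = share_matrix(attraction, dist, beta)   # residence j -> service zone i
    emp = list(basic)
    for _ in range(max_iter):
        pop = [r * sum(emp[i] * res[i][j] for i in range(m)) for j in range(m)]
        new_emp = [basic[i] + s * sum(pop[j] * srv[j][i] for j in range(m))
                   for i in range(m)]
        change = max(abs(a - b) for a, b in zip(new_emp, emp))
        emp = new_emp
        if change < tol:
            break
    pop = [r * sum(emp[i] * res[i][j] for i in range(m)) for j in range(m)]
    return emp, pop

# Hypothetical zones; r and s as reported for the case study.
basic = [100.0, 0.0, 40.0]
VW = [1.0, 2.0, 0.5]
d = [[1.0, 5.0, 9.0],
     [5.0, 1.0, 4.0],
     [9.0, 4.0, 1.0]]
r, s = 2.322, 0.217
emp, pop = lowry_fixed_point(basic, VW, d, alpha=0.2, beta=0.3, r=r, s=s)
```

In this sketch the shares in each row sum to one, so at the fixed point the total employment equals the total basic employment divided by (1 - r s), and the total population equals r times the total employment.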

To ensure that the average residence-employment and population-services costs from the model are equal to the real average costs, the model is iteratively calibrated for the parameters \(\alpha \) and \(\beta \) until the predicted average costs are similar to the real average costs. Finally, the \(V_{i}/W_{i}\) values are iteratively calibrated to guarantee that the demand for space in each zone is lower than or equal to the space available. Notice that, in this model, residents and employees compete for the same space in each zone.

The \(V_{i}/W_{i}\) calibrated attraction values can also be interpreted as bid-rents (Roy and Thill 2004). The bid-rent (\(\omega _{i})\) is complementary to the transportation costs and is given by the formula:

$$\begin{aligned} {\omega _{i}} =-\left( {\ln \frac{1}{\left( {\frac{V_{i} }{W_{i} }} \right) }} \right) \end{aligned}$$
(9)

Therefore, Eqs. (5) and (6) can be mathematically expressed as (10) and (11), respectively:

$$\begin{aligned} \left[ {A_{ij} } \right] =\frac{r.V_{j} .\mathrm{e}^{\omega _{j} -\alpha {d}_{ij} }}{\mathop {\sum }\nolimits _{j=1}^{m} r.V_{j} .\mathrm{e}^{\omega _{j} -\alpha {d}_{ij} }} \end{aligned}$$
(10)
$$\begin{aligned} \left[ {B_{ji} } \right] =\frac{s.V_{i} .\mathrm{e}^{\omega _{i} -\beta {d}_{ij} }}{\mathop {\sum }\nolimits _{i=1}^{m} s.V_{i} .\mathrm{e}^{\omega _{i} -\beta {d}_{ij} }} \end{aligned}$$
(11)

2.2 SIM development and implementation in MATLAB

The SIM described in the section above was coded and integrated in MATLAB 2013a (Mathworks, Natick, United States). The developed tool is user friendly, removes the constraints on matrix operations imposed by previously used software, speeds up the iterative processes, allows the outputs to be exported to spreadsheet format, and supports model calibration and scenario simulation functions. The flowchart of the spatial interaction model for both calibration and simulation is illustrated in Fig. 1.

For the calibration phase, some of the necessary inputs are manually inserted in a standardized Excel workbook which was previously defined and integrated with MATLAB. For each zone, the user must insert the zone name, \(\hbox {E}_{\mathrm{b}}\), the space for population and the space for employment.

The distance matrix and r and s parameters are also inserted in the Excel template workbook. The r and s parameters are calculated by the following formulas:

$$\begin{aligned} r= & {} \frac{\mathrm{Total\,Population}}{\mathrm{Total\,Employment}} \end{aligned}$$
(12)
$$\begin{aligned} s= & {} \frac{\left( {\mathrm{Total\,Employment-Total\,Basic\,Employment}} \right) }{\mathrm{Total\,Population}} \end{aligned}$$
(13)
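As a small worked example of Eqs. (12) and (13), the sketch below computes r and s from regional totals. The function name and the totals are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def activity_ratios(total_population, total_employment, total_basic_employment):
    """Return (r, s) as defined in Eqs. (12) and (13)."""
    r = total_population / total_employment
    s = (total_employment - total_basic_employment) / total_population
    return r, s

# Hypothetical regional totals, for illustration only.
r, s = activity_ratios(total_population=1000.0,
                       total_employment=400.0,
                       total_basic_employment=150.0)
```

By construction, the product of r and s equals the non-basic share of total employment, so it stays below one whenever basic employment is positive.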

Afterward, the program is started and the Excel workbook is loaded from the database into the MATLAB environment. Next, the user is prompted to make optional modifications to some of the data. Other inputs are entered manually, including the friction parameters (\(\alpha \) and \(\beta \)), the average distance costs for both jobs and services, the maximum number of iterations \((I_\mathrm{max})\) and the required tolerance to stop the iterative cycle \((\mathrm{error}_{t})\).

Fig. 1 Model calibration and simulation processes in MATLAB

The friction parameters \(\alpha \) and \(\beta \) are adjusted by Hyman’s calibration method (Hyman 1969). For a generic parameter \(\gamma \) and iteration I:

$$\begin{aligned} \gamma _{I+1} =\frac{\left[ {\left( {C_\mathrm{real} -C_\mathrm{estimated} \left( {\gamma _{I-1} } \right) } \right) .\gamma _{I} -\left( {C_\mathrm{real} -C_\mathrm{estimated} \left( {\gamma _{I} } \right) } \right) .{\gamma _{I-1}} } \right] }{\left[ {C_\mathrm{estimated} \left( {\gamma _{I} } \right) -{C}_\mathrm{estimated} \left( {\gamma _{I-1}} \right) } \right] } \end{aligned}$$
(14)

where \(C_{\mathrm{real}}\) is the real average cost and \(C_{\mathrm{estimated}}\) is the average cost estimated by the model. The stop condition is met when the absolute differences between the estimated and real average costs, for both commuting and services, are lower than the previously defined \(\mathrm{error}_{t}\) parameter:

$$\begin{aligned} \left| {EC_\mathrm{e} -C_\mathrm{e}} \right|<\mathrm{error}_{t} \;\wedge \;\left| {EC_\mathrm{s}-C_\mathrm{s} } \right| <\mathrm{error}_{t} \;\Rightarrow \;\mathrm{stop} \end{aligned}$$
(15)

where \(EC_{\mathrm{e}}\) is the estimated average commuting cost, \(C_{\mathrm{e}}\) is the real average commuting cost, \(EC_{\mathrm{s}}\) is the estimated average cost for the population to access services in a specific zone, \(C_{\mathrm{s}}\) is the real average cost for the population to access services in a specific zone, and \(\mathrm{error}_{t}\) is the maximum tolerance to end the iterative cycle.

To prevent an infinite loop in the case of non-convergence, a maximum number of iterations is defined at the start. If the loop starts to exhibit chaotic behavior, the iterations are automatically stopped.
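The update in Eq. (14) together with the tolerance and iteration limits can be sketched as follows for a single friction parameter. This is a simplified illustration, not the MATLAB routine of this paper: `mean_cost` is a hypothetical stand-in for the average cost estimated by the full model, and the cost values, starting points and target are invented.

```python
import math

def hyman_update(c_real, g_prev, c_prev, g_cur, c_cur):
    """Eq. (14): next parameter value from the last two (parameter, cost) pairs."""
    num = (c_real - c_prev) * g_cur - (c_real - c_cur) * g_prev
    return num / (c_cur - c_prev)

def mean_cost(gamma, costs):
    """Toy estimated average cost under exponential deterrence (hypothetical)."""
    w = [math.exp(-gamma * c) for c in costs]
    return sum(c * x for c, x in zip(costs, w)) / sum(w)

def calibrate(c_real, costs, g0, g1, error_t=0.01, i_max=100):
    """Iterate Eq. (14) until the cost gap is below error_t or i_max is reached."""
    g_prev, g_cur = g0, g1
    c_prev, c_cur = mean_cost(g_prev, costs), mean_cost(g_cur, costs)
    for iteration in range(1, i_max + 1):
        if abs(c_cur - c_real) < error_t:   # one-parameter analogue of Eq. (15)
            return g_cur, c_cur, iteration
        g_next = hyman_update(c_real, g_prev, c_prev, g_cur, c_cur)
        g_prev, c_prev = g_cur, c_cur
        g_cur, c_cur = g_next, mean_cost(g_next, costs)
    return g_cur, c_cur, i_max              # iteration cap reached

gamma, cost, n_iter = calibrate(c_real=4.0, costs=[2.0, 5.0, 10.0], g0=0.1, g1=0.2)
```

In the full model both \(\alpha \) and \(\beta \) would be updated, and both cost gaps in Eq. (15) would have to fall below the tolerance before stopping.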

When the current iteration exceeds the maximum number of iterations \((I_\mathrm{max})\), or when the condition in Eq. (15) is met, the program exports the calculated data to a spreadsheet-compatible format for further numerical and graphical examination in spreadsheet and statistical packages. Subsequently, the calibrated model can be saved in a MATLAB-specific format and stored in the database for later use in scenario simulation.

Once the model is calibrated, it can be loaded into the program to simulate scenarios and assess how they affect the studied region in terms of population and employment distribution, average costs, and bid-rents.

The process starts with the loading of a previously calibrated model. Afterward, the user can modify the inputs for the scenario simulation, namely \(\hbox {E}_{\mathrm{b}}\), the available space and the distance matrix. \(\alpha \) and \(\beta \) are kept fixed, as we assume no change in the overall friction of the region, and the real average costs are taken to be the costs estimated by the calibrated model. Next, the user enters the maximum number of iterations \((I_\mathrm{max})\) and the tolerance to stop the iterative process \((\mathrm{error}_{t})\), and the iterative process starts. A stop counter is initialized to 0 at the beginning of the process.

The iterative process stops differently in the scenario simulation procedure. Because the costs may change as a result of the input modifications, convergence may occur at estimated costs that differ from the calibrated ones. For this reason, the function monitors the difference in the estimated costs between the current and the previous iteration to detect convergence:

$$\begin{aligned}&\left| {EC_{\mathrm{e},I} -EC_{\mathrm{e},I-1} } \right|<\mathrm{error}_{t} \;\wedge \;\left| {EC_{\mathrm{s},I} -EC_{\mathrm{s},I-1} }\right| <\mathrm{error}_{t} \;\Rightarrow \;\mathrm{Stop\,Counter} \leftarrow \mathrm{Stop\,Counter}+1 \end{aligned}$$
(16)
$$\begin{aligned}&\quad \mathrm{Stop\,Counter}\ge 10\;\Rightarrow \;\mathrm{stop} \end{aligned}$$
(17)

Therefore, if the changes in both estimated costs between the previous and the current iteration are below \(\mathrm{error}_{t}\), the program adds 1 to the stop counter. Otherwise, the stop counter is reset. If the condition in Eq. (17) is met, the program stops and assumes that convergence has been achieved.
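The stop-counter rule of Eqs. (16) and (17) can be sketched as below. The function name and the synthetic cost sequence are hypothetical; in the actual tool the pairs of estimated costs would come from successive model runs.

```python
def run_until_stable(cost_pairs, error_t=0.01, needed=10, i_max=100):
    """Return (converged, iterations) for a sequence of (EC_e, EC_s) pairs.

    The counter grows while both costs change by less than error_t between
    consecutive iterations (Eq. 16), is reset otherwise, and triggers the
    stop once it reaches `needed` (Eq. 17).
    """
    stop_counter = 0
    previous = None
    iteration = 0
    for iteration, current in enumerate(cost_pairs, start=1):
        if iteration > i_max:
            return False, i_max
        if previous is not None:
            small_e = abs(current[0] - previous[0]) < error_t
            small_s = abs(current[1] - previous[1]) < error_t
            stop_counter = stop_counter + 1 if (small_e and small_s) else 0
            if stop_counter >= needed:
                return True, iteration
        previous = current
    return False, iteration

# Synthetic, geometrically settling costs (illustration only).
sequence = [(7.0 + 0.5 ** k, 7.0 + 0.4 ** k) for k in range(1, 60)]
converged, n_iter = run_until_stable(sequence)
```

Requiring several consecutive small changes, rather than a single one, guards against stopping on an iteration where the costs happen to stall briefly.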

When the current iteration exceeds the maximum number of iterations, or when the condition in Eq. (17) is met, the program outputs a new spreadsheet that compares the calibrated model data with the scenario simulation data, which makes the differences between the calibration and simulation outputs easy to inspect.

2.3 Terceira Island and Lajes Field—data and characterization

The case study of this paper is Terceira Island, which is one of the nine islands that constitute the Archipelago of the Azores, which belongs to Portugal. According to the most recent data, the island has a population of 56,437, the second highest in the Azores after S. Miguel Island, and comprises two municipalities and 30 parishes. Population, total employment, employment per economic activity and housing data for each of the thirty parishes of Terceira Island were collected from CENSOS 2011 on the website of the Portugal National Institute of Statistics. Each parish was considered as a zone in the model. In addition to these zones, Lajes Field is also considered as a zone, for a total of 31 zones in the model. Lajes Field is a multi-use airfield that, besides being home to the Portuguese Air Force and a regional air passenger terminal, hosts a US Air Force detachment operated by the 65th Air Base Wing of US Air Forces in Europe. Because it has been a US detachment location since 1979, there is a significant proportion of US military personnel and their families on the island, and they have made a very important contribution to the economy of Terceira. On 14 December 2012, it was announced that the US presence in Lajes Field would be reduced from 1800 to 900 through the drawdown of 400 military personnel and 500 family members, plus a reduction of 750 in Portuguese personnel, whose impacts are assessed in the model. These force-structure changes at Lajes Field are estimated to generate savings of $35 million annually for the USA (Lajes Field 2012). However, they could significantly affect the economy of Terceira Island and the Azores. Figure 2 shows Terceira Island divided into its parishes, together with the location of Lajes Field.

Basic employment data were estimated according to Haig (1974), by deducting the minimal percentage of jobs per sector from the local percentage of jobs per sector in each zone. Average distances (or costs) for residence-employment and population-services interactions were derived externally as the weighted sum of the average distance (measured in minutes) travelled by workers and population when going to their jobs and services inside or outside each zone. The average distances for residence-employment and population-services interactions are 7.45 and 7.09 min, respectively (Fig. 2).

Data on house prices and house typology (T1–T5) were collected from real estate agencies that operate on Terceira Island. Data for 91 houses were collected, covering the different typologies (T1 = 12; T2 = 26; T3 = 25; T4 = 20; T5 = 8) and the 31 zones.

2.4 Model calibration and simulation

First, the model for Terceira Island was calibrated with a maximum tolerance to end the iterative cycle of 0.01 and \(I_{\mathrm{max}}\) equal to 100. The operational inputs r and s are equal to 2.322 and 0.217, respectively. Afterward, the results were saved in .xls format for analysis and in .mat format for subsequent use in the simulation step. In the simulation step, we assumed a pessimistic scenario in which \(75\,\%\) of the basic employment of Lajes Field (750 of the 1000 basic jobs in the zone) was cut, in order to resemble a future drawdown. The model was run with this impact, and afterward, the differences between the bid-rents in each scenario were used to calculate the variation in bid-rents caused by the Lajes Field drawdown.

Fig. 2 Terceira Island spatial distribution by parishes, with the location of Lajes Field

2.5 Hedonic price analysis

The .xls outputs were pre-processed for statistical analysis with the statistical package PASW Statistics 22 (SPSS, Inc, Chicago, IL). Dummy variables were coded for each of the house typologies in order to perform a stepwise linear regression, with the natural logarithm of the house prices as the dependent variable and the house typology dummy variables and the calibrated model bid-rents as the independent variables. A significance level of .05 was used for all analyses. Table 1 shows the variables used in the regressions.

Table 1 Used variables in the linear regression

The economic impact on single and total house prices was estimated by multiplying the unstandardized coefficient of the bid-rents by the difference in bid-rents between the calibration and simulation phases. The total economic impact on housing in each zone was estimated by multiplying the result for that zone by the number of houses in the zone.

3 Results and discussion

After the simulation of the Lajes Field drawdown, the model predicts that job losses and outmigration may occur. With the reduction of 750 basic employments in Lajes Field, there is an additional decrease of 802 non-basic employments in Terceira Island and a total decrease of 1502 in terms of employment, plus a reduction of 3502 residents that depend on that lost employment. These numbers are worrying and also point to a possible change in the spatial dynamics of the region. Table 2 displays the bid-rents for each scenario in each zone, and their respective difference. The first run (Scenario 1) corresponds to the current situation in Terceira Island, while the second run (Scenario 2) corresponds to a future reduction of activity in Lajes Field by the USA. Notice that the model is calibrated assuming equilibrium between supply and demand for space in both Scenario 1 and Scenario 2. In fact, the shock is simulated assuming that most of the space remains occupied at a lower rent; only if the bid-rent falls to zero in some zones would the available space in those zones not be fully occupied.

Table 2 Calibrated bid-rent for Scenario 1 and Scenario 2

Results ultimately reveal that almost all zones of Terceira Island will have a negative rent impact due to the Lajes Field American drawdown, indicating beforehand that Terceira Island will be gravely affected not only by the reduction in employment and population but also by the reduction in real estate values. Only three parishes (Fonte do Bastardo, Fontinhas and São Brás) have a positive variation, even though it is quite small when compared to the amplitude of the other negative variations in the island, indicating the systemic nature of the impacts included in the spatial interaction model.

A linear regression was performed with the natural logarithm of the house prices as the dependent variable and with the house typology dummy variables and alternative calibrated bid-rents as independent variables. The starting independent variable was the calibrated model bid-rent, and the T1 dummy variable was omitted to avoid redundancy. Table 3 shows the results of the different regressions, namely the unstandardized beta \(( {u}\upbeta )\) and the significance (t values) of each variable used.

Table 3 Hedonic regression results

The regressions show consistency in the unstandardized coefficients and significance values for most of the variables. However, in the last model, the T2 variable is not significant, that is, its effect cannot be distinguished from that of the omitted T1 typology. Therefore, we repeated the regression without the T2 variable and obtained our final model (6).

In this final regression, all the considered variables are significant at \(p < .05\), and the model has a coefficient of determination \((R^{2})\) of 0.506 and an adjusted \(R^{2}\) of 0.478. This means that the regression model explains about \(48\,\%\) of the variance of the data once the number of predictors is taken into account.

Unstandardized coefficients of the given constant value and bid-rent variable are 10.174 and 0.171, respectively. Therefore, and assuming dummy variables with \(\hbox {x}_{1:5}\) = 0, we have the following equation that determines the price of a T1/T2 house with the bid-rent as the independent variable:

$$\begin{aligned} \hbox {Price}\left( k \right) =\mathrm{e}^{10.174+0.171.B_{k} } \end{aligned}$$
(18)

where \(B_{k}\) is the calibrated or simulated bid-rent for zone k. House prices for each zone are calculated for both scenarios through this equation. Subsequently, the two prices are compared, and the difference in price per house in each zone is multiplied by the total number of houses in that zone to assess the monetary variation in the real estate of the island due to the Lajes Field drawdown. The graph in Fig. 3 shows the changes in the value of houses per zone, presented according to the distance to Lajes Field for each municipality of the island.
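The price comparison described above can be sketched as follows. The constant and the bid-rent coefficient are the values reported in the text; the function names, the bid-rents and the number of houses are hypothetical and used only for illustration.

```python
import math

CONSTANT = 10.174       # unstandardized constant reported in the text
BID_RENT_COEF = 0.171   # unstandardized bid-rent coefficient reported in the text

def house_price(bid_rent):
    """Price of a baseline-typology house for a given bid-rent, as in Eq. (18)."""
    return math.exp(CONSTANT + BID_RENT_COEF * bid_rent)

def total_variation(bid_before, bid_after, n_houses):
    """Change in total housing value of a zone between two scenarios."""
    return n_houses * (house_price(bid_after) - house_price(bid_before))

# Hypothetical zone: the bid-rent falls from 1.20 to 0.90 and there are 500 houses.
delta = total_variation(bid_before=1.20, bid_after=0.90, n_houses=500)
```

Because the regression is log-linear, a one-unit fall in the bid-rent multiplies the predicted price by \(\mathrm{e}^{-0.171}\), that is, a reduction of roughly 16 %, whatever the starting level.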

Fig. 3 Estimated price variation per house per zone

As expected, Lajes Field zone has the highest negative variation in terms of house prices. Besides Lajes Field, the higher negative variations per house are found in zones that are mainly residential areas close to Lajes Field, such as Lajes, Santa Cruz, Cabo da Praia and Vila Nova. Angra do Heroísmo municipality will also be affected, but at a lower scale compared with the municipality of Praia da Vitória.

An interesting phenomenon occurs in the parishes of Fontinhas, Fonte do Bastardo and São Brás. These zones are mainly residential areas in the Praia da Vitória municipality, so they would be expected to be strongly affected by the Lajes Field drawdown as well. However, the effect in these zones is actually positive. This may be due to the systemic reactions implicit in the spatial interaction model: as the attraction of the nearby, bigger parishes of Lajes and Santa Cruz decreases, those zones become relatively more central.

Next, we multiply the house price variation by the number of houses in each zone to estimate the total price variation per zone. Results are expressed in Fig. 4.

Fig. 4 Estimated total price variation per zone

Summing the total values for all zones gives a loss of approximately 55 million €, which represents the total economic impact on the real estate value of Terceira Island due to the American drawdown in Lajes Field. The most affected parishes are those with a higher number of houses and higher levels of employment linked to Lajes Field, such as Santa Cruz and Lajes. As in Fig. 3, there is a small positive variation in the parishes of São Brás, Fontinhas and Fonte do Bastardo but, overall, the whole island will be negatively affected by the Lajes Field drawdown, especially the municipality of Praia da Vitória.

4 Conclusion

In conclusion, the use of the bid-rents of a spatial interaction model to estimate changes in real estate value has proven able, through its application to a real scenario, to predict the systemic impact of exogenous changes in the urban system, not only in terms of employment, population, and commuting and shopping flows but also in terms of property values. Nevertheless, the joint SIM-HPM model is very sensitive to the data on available space, and since the evolution of available space over time is not considered in this static model, it is important to include, in future work, the dynamics of available space related to changes in property values assessed by bid-prices.