Spatial Regression-Based Model Specifications for Exogenous and Endogenous Spatial Interaction

LeSage, James P.; Fischer, Manfred M.

doi:10.1007/978-3-319-30196-9_2

Part of the book series: Advances in Spatial Science (ADVSPATIAL)

Abstract

Spatial interaction models represent a class of models that are used for modeling origin-destination flow data. The interest in such models is motivated by the need to understand and explain the flows of tangible entities such as persons or commodities or intangible ones such as capital, information or knowledge between regions. These models attempt to explain interaction between origin and destination regions using (i) origin-specific attributes characterizing the ability of the origins to generate flows, (ii) destination-specific characteristics representing the attractiveness of destinations, and (iii) variables that characterize the way spatial separation of origins from destinations constrains or impedes the interaction. They implicitly assume that using spatial separation variables such as distance between origin and destination regions will eradicate the spatial dependence among the sample of spatial flows.

1 Introduction

Spatial interaction models represent a class of models that are used for modeling origin-destination flow data. The interest in such models is motivated by the need to understand and explain the flows of tangible entities such as persons or commodities or intangible ones such as capital, information or knowledge between regions. These models attempt to explain interaction between origin and destination regions using (i) origin-specific attributes characterizing the ability of the origins to generate flows, (ii) destination-specific characteristics representing the attractiveness of destinations, and (iii) variables that characterize the way spatial separation of origins from destinations constrains or impedes the interaction. They implicitly assume that using spatial separation variables such as distance between origin and destination regions will eradicate the spatial dependence among the sample of spatial flows.

However, research dating back to the 1970s noted that spatial dependence or autocorrelation might be intermingled in spatial interaction model specifications. The idea was first put forth in a theoretical context by Curry (1972), with some subsequent debate in Curry et al. (1975). Griffith and Jones (1980) documented the presence of spatial dependence in conventional spatial interaction models. Despite this, most practitioners assume independence among observations and few have used spatial lags of the dependent variable or disturbances in spatial interaction models. Exceptions are Bolduc et al. (1992), and Fischer and Griffith (2008) who rely on spatial lags of the disturbances, and LeSage and Pace (2008) who use spatial lags of the dependent variable.

The focus of this chapter is on the log-normal version of the model. In this context, we consider spatial econometric specifications that can be used to accommodate two types of dependence scenarios, one involving endogenous interaction and the other exogenous interaction. These model specifications replace the conventional assumption of independence between origin-destination-flows with formal approaches that allow for two different types of spatial dependence in flow magnitudes.

Endogenous interaction reflects situations where there is reaction to feedback regarding flow magnitudes from regions neighboring origin and destination regions. This type of interaction can be modeled using specifications proposed by LeSage and Pace (2008) who use spatial lags of the dependent variable to quantify the magnitude and extent of feedback effects, hence the term endogenous interaction. For example, commuters might react to congestion in regions near the origin or destination of their commute to work by adjusting future location decisions. This would of course produce changes in observed flows over time that need to be considered in light of the steady state equilibria that would characterize future period flows across the commuting network. Another example would be the case of international trade flows, where a tariff or other impediment to flows might evoke a long-run response that changes the structure of flows across the network of trading countries. We typically estimate the model parameters, which describe the responsiveness of flows to characteristics of the regions and distance between regions, from a cross-section of flow magnitudes observed over some time period (say the past 5 years), so time is not explicit in these cross-sectional models. However, interpretation of the model estimates can take place with respect to comparative statics reflecting changes from one equilibrium steady state to another.

Exogenous interaction represents a situation where spillovers arise from nearby (or perhaps even distant) regions, and these need to be taken into account when modeling observed variation in flows across the network of regions. In contrast to endogenous interaction, these contextual effects do not generate reaction to the spillovers, leading to a model specification that can be interpreted without considering changes in the long-run equilibrium state of the system of flows. Spillovers arising from spatial dependence on the context in which commuters operate impact observed variation in flows between regions and we can quantify these types of impacts without reference to long-run equilibrium impacts on flows across the network. As in the case of social networks (where the term contextual effects has its origins), contextual effects are modeled using spatial lags of the explanatory variables that represent characteristics of neighboring (or more generally connected) regions, but not spatial lags of the dependent variable, hence the term exogenous interaction.

2 The Log-Normal (Independent) Spatial Interaction Model

Spatial interaction models essentially assert a multiplicative relationship between observed flows (reflecting the magnitude of interaction) and characteristics of origin and destination regions, as well as measures of separation between the regions (typically distance). As is typical of statistical model relationships, observations on the dependent variable (observed flows between origin and destination dyads, labeled i and j) are assumed independent of observed flows between other dyads, say k and l (see, for example Sen and Smith 1995, and Fischer and Wang 2011). Such a relationship is shown in (2.1).

$$\displaystyle\begin{array}{rcl} Y (i,j)& =& CX(i)X(\,j)S(i,j),\ \ \ \ i,j = 1,\ldots,n{}\end{array}$$

(2.1)

where Y (i, j) denotes flows from region i to region j, and C is a constant of proportionality. X(i) and X( j) represent origin-specific and destination-specific characteristics, with S(i, j) reflecting resistance or deterrence to flows between the origin and destination, typically modeled using some form of deterrence function reflecting spatial separation between locations i and j. At relatively large scales of geographical inquiry this might be the great circle distance between regions, measured in terms of the distance between their respective centroids. In other cases, it might be transportation or travel time, cost of transportation, perceived travel time or any other sensible measure such as political distance, language or cultural distance measured in terms of nominal or categorical attributes.

The exact functional form of the three terms X(i), X( j) and S(i, j) on the right hand side of (2.1) is subject to varying degrees of conjecture. There is wide agreement that the origin and destination factors are best given by power functions $X(i)^{\beta _{o}}$ and $X(\,j)^{\beta _{d}}$ where X(i) represents some appropriate variable measuring the propulsiveness of origins and X( j) attractiveness of destinations in a specific spatial interaction context. The term gravity model is sometimes used in place of spatial interaction because the relationship posits that the magnitude of flows (reflecting interaction) between dyads i and j is directly proportional to size of the regions, when X is a measure of size and the coefficients β _o, β _d take values of unity. In a statistical modeling context, these coefficients are parameters to be estimated.

The deterrence function S(i, j) also has a gravity interpretation such that interaction is inversely proportional to distance between dyads i and j. A number of alternative more flexible specifications have been proposed in the literature (see Fischer and Wang 2011), one being the power function:

$$\displaystyle\begin{array}{rcl} S(i,j)& =& [G(i,j)]^{\gamma }{}\end{array}$$

(2.2)

for a scalar (generalized) distance measure, G(i, j), and negative parameter γ (reflecting the inverse relationship), with γ treated as a parameter to be estimated. The deterrence function reflects the way in which spatial separation or distance constrains or impedes movement across space. In general, we will refer to this as distance between an origin i and destination j, using G(i, j).
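As a worked step (a sketch rather than a statement from the source: identifying the intercept with log C is an assumption made here), substituting the power functions $X(i)^{\beta _{o}}$, $X(\,j)^{\beta _{d}}$ and the deterrence function (2.2) into (2.1) and taking logarithms gives, for a single dyad (i, j):

$$\displaystyle\begin{array}{rcl} \log Y (i,j)& =& \log C +\beta _{o}\log X(i) +\beta _{d}\log X(\,j) +\gamma \log G(i,j)\end{array}$$

The multiplicative relationship therefore becomes linear in the parameters β _o, β _d and γ, which is what allows a regression-type treatment of logged flows.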

LeSage and Pace (2008) use a matrix/vector representation of the log-transformed expression in (2.1) yielding the log-normal spatial interaction model, shown in (2.3)

$$\displaystyle\begin{array}{rcl} y& =& \alpha \iota _{n^{2}} + X_{o}\beta _{o} + X_{d}\beta _{d} + g\gamma +\varepsilon {}\end{array}$$

(2.3)

which more closely resembles a conventional regression relationship. In (2.3), y is an n ² × 1 vector of (logged) flows constructed by stacking columns of the observed n × n flow matrix Y, where we assume destination-centric organization throughout this chapter. This means that the i, jth element of the flow matrix represents a flow from region i to j. Similarly, applying the log transformation to the n × n matrix of distances G between the n destination and origin regions and stacking the columns results in an n ² × 1 vector g of (logged) distances, with associated coefficient γ. The term ɛ represents an n ² × 1 vector of constant variance, independent identically distributed normal disturbances. LeSage and Pace (2008) show that

$$\displaystyle\begin{array}{rcl} X_{o}& =& \iota _{n} \otimes X{}\end{array}$$

(2.4)

$$\displaystyle\begin{array}{rcl} X_{d}& =& X \otimes \iota _{n}{}\end{array}$$

(2.5)

for the case of a destination-centric organization, where X is an n × R matrix of characteristics for the n regions, ⊗ denotes the Kronecker product and ι _n is an n × 1 vector of ones. We note that this represents a general case where the same set of R explanatory variables is used for both origins and destinations. Thomas-Agnan and LeSage (2014) point out this may be preferred to a specification where different (subsets of the R) explanatory variables are used for origin and destinations, since exclusion of important explanatory variables may result in omitted variable bias. The scalar parameter γ reflects the effect of the vector of logged (generalized) distances g on flows which is—given the power function specification in (2.2)—thought to be negative. The parameter α denotes the intercept term.
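The construction in (2.3)–(2.5) can be sketched numerically. The following is a minimal illustration, not the authors' code: the function names `build_design` and `simulate_and_fit`, the simulated data and the "true" coefficient values are all hypothetical, and ordinary least squares is used only because the independent model (2.3) has the form of a conventional regression.

```python
import numpy as np

def build_design(X, G):
    """Return (X_o, X_d, g) for the log-normal model (2.3).

    X : (n, R) array of (logged) regional characteristics.
    G : (n, n) array of strictly positive generalized distances,
        G[i, j] being the separation between origin i and destination j.
    """
    n = X.shape[0]
    iota = np.ones((n, 1))
    X_o = np.kron(iota, X)             # (2.4): iota_n (x) X
    X_d = np.kron(X, iota)             # (2.5): X (x) iota_n
    g = np.log(G).flatten(order="F")   # vec(): stack the columns of log G
    return X_o, X_d, g

def simulate_and_fit(noise_sd, seed=0, n=6, R=2):
    """Simulate logged flows from (2.3) and re-estimate the coefficients by OLS."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, R))
    G = 1.0 + rng.uniform(size=(n, n))   # made-up positive distances
    X_o, X_d, g = build_design(X, G)
    # Hypothetical values for (alpha, beta_o, beta_d, gamma), with R = 2.
    truth = np.array([1.0, 0.8, 0.3, 0.6, -0.2, -1.2])
    Z = np.column_stack([np.ones(n * n), X_o, X_d, g])
    y = Z @ truth + noise_sd * rng.normal(size=n * n)
    est, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return truth, est

truth, est = simulate_and_fit(noise_sd=0.1)
print(np.round(est, 2))
```

With column stacking, the dyad (i, j) occupies position j·n + i of y (zero-based), so row j·n + i of X_o holds the characteristics of origin i and the same row of X_d holds those of destination j.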

The Kronecker product repeats the same values of the n regions in a strategic way to create a matrix of characteristics associated with each origin (destination) region, hence the use of the notation $X_{o} =\iota _{n} \otimes X,X_{d} = X \otimes \iota _{n}$ to represent these explanatory variables. Recognizing this has important implications for how we interpret estimates of the parameter vectors β _o, β _d from these models. The literature has interpreted β _o as reflecting a typical regression partial derivative ∂ y∕∂ X _o, showing how changes in origin region characteristics impact flows (on average across the sample of n ² dyads as is typical of regression estimates). Of course, this suggests we can change characteristics of origin regions while holding those of destination regions constant, since partial derivatives reflect a ceteris paribus change in X _o. It should be clear that a change in the rth characteristic of a single region i, X _i ^r, will produce changes in both X _o, X _d, since by definition $\Delta X_{o}^{r} =\iota _{n} \otimes (X^{r} + \Delta X_{i}^{r})$, and $\Delta X_{d}^{r} = (X^{r} + \Delta X_{i}^{r}) \otimes \iota _{n}$.

Intuitively, changes to the rth characteristic of a single region i will impact both inflows and outflows to all other regions engaged or connected with region i as either an origin or destination. For example, a (ceteris paribus) increase in employment in region i would lead to inflows of commuters to this region (when viewed as a destination) from (potentially) all other (origin) regions and a decrease in outflows of commuters (when viewed as an origin) to (potentially) all other (destination) regions. We will have more to say about this later.
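A small numerical check makes this point concrete. The sketch below uses made-up values for a single characteristic in n = 4 regions and the Kronecker definitions in (2.4) and (2.5); the helper name `od_vectors` is hypothetical.

```python
import numpy as np

n = 4
x = np.arange(1.0, n + 1)     # one characteristic for regions 1..4 (made-up values)
iota = np.ones(n)

def od_vectors(x):
    """Origin and destination versions of a single characteristic, as in (2.4)-(2.5)."""
    return np.kron(iota, x), np.kron(x, iota)

x_o, x_d = od_vectors(x)

x_new = x.copy()
x_new[2] += 1.0               # change the characteristic of region 3 only
x_o_new, x_d_new = od_vectors(x_new)

changed_o = np.flatnonzero(x_o_new != x_o)     # dyads with region 3 as origin
changed_d = np.flatnonzero(x_d_new != x_d)     # dyads with region 3 as destination
affected = np.union1d(changed_o, changed_d)

print(changed_o, changed_d, affected.size)
```

A change in one region alters n elements of the origin vector and n elements of the destination vector, which overlap only in the intraregional dyad, so 2n − 1 of the n ² dyads are affected. Holding X _d fixed while changing X _o is therefore not possible.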

There are some limitations to treating the spatial interaction relationship as a regression relationship between the vector of n ² × 1 logged flows and log-transformed explanatory variables. Regression relationships require the assumption of constant variance independent normally distributed disturbances in order to rely on conventional t-statistics for statistical inference regarding significance of the explanatory variables. Normal disturbances imply normally distributed flows, which is often not the case. Some flows reflect counts of migrants moving between regions, and many flow matrices contain a large proportion of dyads reflecting zero flows. These raise issues regarding the appropriate method for estimating regression-based specifications of spatial interaction model relationships, but do not have an impact on issues we will discuss pertaining to endogenous versus exogenous interaction specifications, or interpretation of estimates from these relationships.

3 Exogenous Versus Endogenous Spatial Interaction Specifications

We set forth spatial regression-based specifications for exogenous and endogenous spatial interaction models, with a focus on interpretative considerations pertaining to estimates from these two types of models.

3.1 An Endogenous Spatial Interaction Specification

As noted, this type of specification allows for flows from regions neighboring the origin region i or destination region j as well as flows between regions neighboring the origin and neighboring the destination, to exert an impact on the magnitude of observed flows between dyad (i, j). LeSage and Pace (2008) label dependence of flows on regions neighboring the origin i as origin-based dependence, that on flows neighboring the destination j as destination-based dependence, and that arising from flows between regions neighboring the origin and neighboring the destination as origin-destination based dependence.

The basic notion is that larger observed flows from an origin to a destination region are accompanied by (i) larger flows from regions nearby the origin to the destination region (origin-based dependence); (ii) larger flows from the origin region to regions neighboring the destination region (destination-based dependence), and (iii) larger flows from neighbors to the origin to regions that are neighbors of the destination (origin-to-destination-based dependence). This is accomplished using the specification in (2.6).

$$\displaystyle\begin{array}{rcl} Ay& =& \alpha \iota _{n^{2}} + X_{o}\beta _{o} + X_{d}\beta _{d} + g\gamma +\varepsilon {}\end{array}$$

(2.6)

$$\displaystyle\begin{array}{rcl} A& =& (I_{n^{2}} -\rho _{o}W_{o})(I_{n^{2}} -\rho _{d}W_{d}){}\end{array}$$

(2.7)

$$\displaystyle\begin{array}{rcl} & =& (I_{n^{2}} -\rho _{o}W_{o} -\rho _{d}W_{d} +\rho _{o}\rho _{d}W_{d}W_{o}){}\end{array}$$

(2.8)

$$\displaystyle\begin{array}{rcl} y& =& \rho _{o}W_{o}y +\rho _{d}W_{d}y +\rho _{w}W_{w}y +\alpha \iota _{n^{2}} + X_{o}\beta _{o} + X_{d}\beta _{d} + g\gamma +\varepsilon.{}\end{array}$$

(2.9)

Several points are worth noting regarding this specification. The matrix product W _w = W _d W _o can be written as W ⊗ W, where W is an n × n spatial weight matrix. While matrix multiplication in (2.8) produces the term $\rho _{w} = -\rho _{o}\rho _{d}$, there is no need to impose this implied restriction during estimation of the model, so ρ _w enters (2.9) as a free parameter. The resulting model statement in (2.9) captures origin-based dependence with the spatial lag term W _o y, destination-based dependence with W _d y, and origin-destination-based dependence using W _w y. The associated parameters ρ _o, ρ _d, ρ _w reflect the relative strength of these three different dependencies.
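The algebra in (2.7) and (2.8) can be verified numerically. This section does not define W _o and W _d explicitly, so the sketch below assumes W _o = I _n ⊗ W and W _d = W ⊗ I _n, which is the choice consistent with column stacking of a flow matrix whose (i, j) element is the flow from i to j. The row-normalized line-neighbor matrix, the helper `line_weights` and the parameter values are illustrative assumptions.

```python
import numpy as np

def line_weights(n):
    """Row-normalized weights for n regions on a line (left/right neighbors)."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W / W.sum(axis=1, keepdims=True)

n = 5
W = line_weights(n)
I_n = np.eye(n)
W_o = np.kron(I_n, W)     # assumed form: averages flows over neighbors of the origin
W_d = np.kron(W, I_n)     # assumed form: averages flows over neighbors of the destination
W_w = W_d @ W_o           # origin-to-destination dependence

rho_o, rho_d = 0.4, 0.3   # hypothetical dependence parameters
I_N = np.eye(n * n)
A_product = (I_N - rho_o * W_o) @ (I_N - rho_d * W_d)                   # (2.7)
A_expanded = I_N - rho_o * W_o - rho_d * W_d + rho_o * rho_d * W_w      # (2.8)

# Spatial lags of a flow matrix Y, with y = vec(Y) stacking columns.
rng = np.random.default_rng(1)
Y = rng.normal(size=(n, n))
y = Y.flatten(order="F")

print(np.allclose(A_product, A_expanded), np.allclose(W_w, np.kron(W, W)))
```

Under these assumptions W _o y equals vec(WY), an average of flows from neighbors of the origin to the same destination, while W _d y equals vec(YW′), an average of flows from the same origin to neighbors of the destination.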

This specification posits a simultaneous or endogenous response relationship between the variation in the dependent variable reflecting flows between all dyads (y) and flows between other regions (specifically W _o y, W _d y, W _w y) within the observed network of interregional and intraregional flows.^{Footnote 1} This has implications for how we interpret the coefficient estimates from this type of specification, with details set forth in LeSage and Thomas-Agnan (2014). It also has implications for how we must estimate the parameters β _o, β _d, γ, ρ _o, ρ _d, ρ _w, with details provided in LeSage and Pace (2008, 2009, Chapter 8). We will discuss interpretation issues in a later section. This discussion takes the parameter estimates as given, and presumes these reflect valid estimates produced using appropriate methods (either maximum likelihood or Bayesian Markov Chain Monte Carlo procedures).

3.1.1 A Theoretical Motivation for Endogenous Interaction

A criticism that might be leveled at the endogenous interaction specification in (2.9) is that this appears to arise from mere matrix algebra manipulations, rather than economic theory. We present a theoretical motivation taken from LeSage and Thomas-Agnan (2014) based on the notion that location decisions of commuters are influenced by behavior of other commuters in previous periods.

They argue that commuting residents might be influenced by nearby flows (congestion) resulting from past location decisions of other residents in neighboring regions. It might also be the case that firms are influenced by congestion arising from location decisions of nearby firms in the past.

They formally express this type of dyadic O-D flow dependence of y_t at time t on past flows y_{t−1} as:

$$\displaystyle\begin{array}{rcl} y_{t}& =& My_{t-1} + Z\delta +\varepsilon _{t} \\ M& =& (\rho _{d}W_{d} +\rho _{o}W_{o} +\rho _{w}W_{w}) \\ Z& =& \left (\begin{array}{ccc} X_{d}&X_{o}&g \end{array} \right ) \\ \delta & =& \left (\begin{array}{ccc} \beta _{d}&\beta _{o}&\gamma \end{array} \right )' \\ \varepsilon _{t}& \sim & N(0,\sigma ^{2}I_{n^{2}}) {}\end{array}$$

(2.10)

where underlying characteristics of the regions X remain relatively fixed over time, allowing us to write Z without a time subscript. Since the characteristics of regions in flow models often represent size of regions, this assumption seems (approximately) valid.

Expression (2.10) indicates that (commuting-to-work) flows between O-D dyads at time t depend on past period flows observed by residents and firms in regions neighboring their origin (W_o y_{t−1}) and destination regions (W_d y_{t−1}), as well as flows between regions neighboring the origin to regions neighboring the destination (W_w y_{t−1}). This is close to the endogenous interaction specification from (2.9), but relies on a time lag or past period flows, not current period flows as we have in our cross-sectional model.

LeSage and Thomas-Agnan (2014) show that we can interpret the endogenous spatial interaction model as the outcome or expectation of a long-run equilibrium or steady state relationship as time q → ∞, which is shown in (2.11).

$$\displaystyle\begin{array}{rcl} \mbox{ lim}_{q\rightarrow \infty }E(y_{t+q})& =& (I_{n^{2}} - M)^{-1}Z\delta \\ & =& (I_{n^{2}} -\rho _{d}W_{d} -\rho _{o}W_{o} -\rho _{w}W_{w})^{-1}Z\delta.{}\end{array}$$

(2.11)

Of course, this is the expectation for the data generating process of the spatial autoregressive interaction model given in (2.9).
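The limit in (2.11) can be illustrated by iterating the expectation of (2.10) forward in time. The sketch below is a numerical illustration under stated assumptions, not the derivation in LeSage and Thomas-Agnan (2014): the Kronecker forms of W _o and W _d, the row-normalized line-neighbor matrix, the random stand-in for Z and all parameter values are hypothetical. The parameters are chosen so that every row of M has absolute sum below one, which guarantees that the iteration converges.

```python
import numpy as np

def line_weights(n):
    """Row-normalized weights for n regions on a line (left/right neighbors)."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W / W.sum(axis=1, keepdims=True)

n = 5
W = line_weights(n)
I_n = np.eye(n)
W_o, W_d, W_w = np.kron(I_n, W), np.kron(W, I_n), np.kron(W, W)   # assumed forms

rho_d, rho_o, rho_w = 0.3, 0.4, -0.12      # hypothetical; absolute values sum to 0.82
M = rho_d * W_d + rho_o * W_o + rho_w * W_w

rng = np.random.default_rng(2)
Z = rng.normal(size=(n * n, 3))            # random stand-in for (X_d, X_o, g)
delta = np.array([0.6, 0.8, -1.2])         # hypothetical (beta_d, beta_o, gamma)'
Zdelta = Z @ delta

# Closed form (2.11): (I - M)^{-1} Z delta.
steady_state = np.linalg.solve(np.eye(n * n) - M, Zdelta)

# Iterate E(y_{t+q}) = M E(y_{t+q-1}) + Z delta from an arbitrary starting point.
Ey = np.zeros(n * n)
for _ in range(300):
    Ey = M @ Ey + Zdelta

print(np.max(np.abs(Ey - steady_state)))
```

The iterated expectation settles on the closed-form steady state, so the cross-sectional model (2.9) can be read as describing where the time-lagged process comes to rest.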

From a theoretical perspective, changes in transportation infrastructure (improvements in the road network) that connects commuters between regions would be expected to result in endogenous interaction of the type captured by this model specification. We would expect to see diffusion of changes in commuting flows taking place over space, that impact flows in neighboring regions with faster commuting times, regions that neighbor these regions, and so on. These global spillover impacts are what characterize endogenous interaction, and they presumably lead to a new long-run steady state equilibrium in residents’ choices regarding routes used and firms’ choices about location.

In general shared resources are often thought to be the basis for global spillover impacts and the associated diffusion of these impacts to neighbors, neighbors to neighbors, and so on. The highway network that passes through many regions would represent one type of resource shared by many regions. Changes taking place on one segment of the highway would have (potentially) far reaching global spillover impacts.

3.2 An Exogenous Spatial Interaction Specification

There are other modeling situations where endogenous interaction is not a likely phenomenon, but spatial spillover impacts such as congestion in neighboring regions are of interest when modeling variation in flows across the network of regions. Theoretical aspects of the modeling circumstance would provide one approach to distinguishing which type of specification is most appropriate for any given application.

Exogenous interaction specifications are characterized by spatial lags of the exogenous variables X _o, X _d, leading to a model

$$\displaystyle\begin{array}{rcl} y& =& \alpha \iota _{n^{2}} + X_{o}\beta _{o} + X_{d}\beta _{d} + g\gamma + W_{o}X_{o}\theta _{o} + W_{d}X_{d}\theta _{d} +\varepsilon.{}\end{array}$$

(2.12)

It should be clear that no endogenous relationship between flows (y) and flows from neighboring regions exists in this specification. Instead, we have a situation where changes in characteristics of regions neighboring the origin (W _o X _o) and regions neighboring the destination (W _d X _d) help explain variation in flows across dyads.

A change in characteristics of neighboring regions, for example, an increase in the number of retired persons (non-commuters) locating in regions that neighbor commuting residents located at origin i (W _o X _o) might influence the magnitude of flows between dyads (i, j). Similarly, retirees locating in regions that neighbor commuters’ destination regions j (W _d X _d) might influence the magnitude of flows between dyads (i, j).

A distinction between this specification and the endogenous specification is that the focus here is on the local spillover impacts on flows arising from changes in characteristics of regions neighboring the origin or destination region. There is no implication that flows respond to feedback impacts associated with the increased number of retirees locating in regions neighboring the origin or destination, just spatial spillover impacts on the pattern of flows between origin and destination dyads due to changes in the characteristics of (say immediately) neighboring regions. Global spillovers imply diffusion over space, whereas local spillovers do not imply diffusion over space that impacts neighbors, neighbors to neighbors, and so on. Reduced congestion arising from retired persons (non-commuters) locating in a specific region would likely impact commuters from neighboring regions, but the impact would not extend to more distant neighboring regions. The number of non-commuters located in any one region does not reflect a shared resource, and would be expected to have only local spillover impacts.
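The local character of these spillovers can be seen by tracing a change in one region through (2.12). The sketch below is illustrative only: it assumes eight regions on a line with row-normalized left/right neighbor weights, the Kronecker forms W _o = I _n ⊗ W and W _d = W ⊗ I _n, a single explanatory variable, and made-up coefficient values.

```python
import numpy as np

def line_weights(n):
    """Row-normalized weights for n regions on a line (left/right neighbors)."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W / W.sum(axis=1, keepdims=True)

n = 8
W = line_weights(n)
iota = np.ones(n)
I_n = np.eye(n)
W_o, W_d = np.kron(I_n, W), np.kron(W, I_n)              # assumed forms

beta_o, beta_d, theta_o, theta_d = 0.8, 0.6, 0.2, 0.1    # hypothetical coefficients

dx = np.zeros(n)
dx[2] = 1.0                                              # unit change in region 3
dx_o, dx_d = np.kron(iota, dx), np.kron(dx, iota)

# Change in logged flows implied by (2.12); the intercept and distance terms drop out.
dy = beta_o * dx_o + beta_d * dx_d + theta_o * (W_o @ dx_o) + theta_d * (W_d @ dx_d)
dY = dy.reshape((n, n), order="F")                       # dY[i, j]: flow from i to j

near = [1, 2, 3]                                         # region 3 and its two neighbors
far = [0, 4, 5, 6, 7]
print(np.round(dY, 2))
```

Only dyads whose origin or destination is region 3 or one of its immediate neighbors respond. Flows between regions outside that set are unchanged, in contrast to the endogenous specification, where the inverse in (2.11) spreads the response across the whole network.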

Another consideration useful in distinguishing between these two types of specifications would be permanent and predictable events versus temporary or unpredictable events. For example, unpredictable events such as traffic delays due to construction or accidents in neighboring regions would not be expected to produce endogenous interaction effects because of the unpredictability of such events. It should be noted that congestion effects arising from unpredictable events such as these may create local spillover congestion that spans many regions, so we should not think of local spillovers as impacting only nearby/neighboring regions. We still label these local spillover effects because they are not associated with endogenous interaction or feedback effects whereby commuters adjust their travel routes. However, consistently higher accident rates in a group of regions might allow commuters to predict traffic delays resulting in endogenous reactions such that commuters change their routes to avoid such regions. Observed adjustments in travel routes by many commuters with widely varying origins and destinations would of course appear as a global spillover effect having impacts on regions neighboring the construction or accident zone, neighbors to these regions, neighbors to the neighbors of these regions, and so on. This of course would be reflected in the new long-run steady state equilibrium commuting flows.

4 Interpreting Estimates From Spatial Interaction Specifications

In Sect. 2.4.1 we consider how changes in the characteristics of regions impact flows in the case of the conventional (non-spatial) interaction model from (2.3). This discussion draws heavily on ideas set forth by LeSage and Thomas-Agnan (2014). They point out that changes in the rth characteristic of region i, $\Delta X_{i}^{r}$, will produce changes in flows into region i from (potentially) (n − 1) other regions, as well as flows out of region i to (potentially) (n − 1) other regions. This can be seen by noting that the matrices X _o = ι _n ⊗ X and X _d = X ⊗ι _n from (2.4) and (2.5) repeat X _i ^r n times. Unlike the situation in conventional regression models where a change $\Delta X_{i}^{r}$ leads to changes in only observation i of the dependent variable, y _i, we cannot change single elements of X _d ^r, X _o ^r, nor should we interpret the coefficient estimates $\hat{\beta }_{o},\hat{\beta }_{d}$ as reflecting the impact of this change (averaged over all observations) on a single element of the dependent variable vector y.

The fact that changes in characteristics of a single region give rise to numerous responses in the flow matrix rather than changes in a single observation (dyad) of the dependent variable (as in traditional regression) creates a challenge for drawing inferences about the partial derivative impacts of changing regional characteristics on flows. To address this challenge, Sect. 2.4.1 also sets out scalar summary measures for the impact of changing regional characteristics on flows, which collapse the many changes in flows to a single number. These scalars average over the many changes that arise in the flow matrix from changing characteristics of the regions, as is typical of the way in which we interpret regression models.

In Sect. 2.4.2 we describe how LeSage and Thomas-Agnan (2014) extend the scalar summary approach to the case of an endogenous spatial interaction specification from (2.9). In this model specification, changes in the characteristics of a single region i can impact flows into and out of region i to its 2(n − 1) dyad (i, j) partners (as described above), but also flows into and out of regions that neighbor the origin i and destination j regions that are not part of the dyad (i, j). This arises from the spatial dependence part of the spatial autoregressive interaction model. An implication is that we should not interpret the coefficient estimates β _d, β _o as if they were regression estimates that reflect partial derivative changes in the dependent variable associated with changes in the explanatory variables.

Section 2.4.3 adopts the scalar summary approach to the case of the exogenous spatial interaction specification introduced here, which reflects new ideas not previously considered in the literature. However, we show that interpretation of estimates from these models reflects a special case of the scalar summary approach set forth by LeSage and Thomas-Agnan (2014).

4.1 Interpreting Estimates from Non-Spatial Interaction Specifications

Before proceeding to interpretation of the model estimates, we adopt an approach suggested by LeSage and Pace (2009, p. 223) that introduces a separate model for within region (intraregional) flows, which tend to have large values relative to between region flows. This is done by creating an intercept for flows associated with the main diagonal of the flow matrix (intraregional flows) that we label $\tilde{\alpha }$, as well as a set of explanatory variables for these flows that we label X _i. The explanatory variables X _d, X _o are adjusted to have zero values for main diagonal elements of the flow matrix and the new variables matrix X _i has associated coefficients that we label β _i. This set of explanatory variables will capture variation in intraregional flows. An adjusted version of (2.3) is shown in (2.13) reflecting these modifications to the model, where vec is the operator that converts a matrix to a vector by stacking its columns.

$$\displaystyle\begin{array}{rcl} y& =& \alpha \tilde{\iota }_{n^{2}} +\tilde{\alpha } \mbox{ vec}(I_{n}) + \tilde{X} _{o}\beta _{o} + \tilde{X} _{d}\beta _{d} + X_{i}\beta _{i} + g\gamma +\varepsilon {}\end{array}$$

(2.13)

We use $\tilde{\iota }_{n^{2}} =\iota _{n^{2}} -\mbox{ vec}(I_{n}), \tilde{X} _{o} = X_{o} - X_{i}, \tilde{X} _{d} = X_{d} - X_{i}$ to reflect the adjustment made to the original intercept and explanatory variables matrices by setting these elements to zero. The matrix X _i contains non-zero values only for dyads where the origin equals the destination (i.e., intraregional flows).
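The adjusted regressors in (2.13) can be assembled as follows. This is a minimal sketch with simulated characteristics; the variable names are hypothetical, and X _i is built by keeping the rows of X _o that correspond to intraregional dyads, which is one way to satisfy the description above.

```python
import numpy as np

n, R = 4, 2
rng = np.random.default_rng(3)
X = rng.normal(size=(n, R))                 # made-up regional characteristics
iota = np.ones((n, 1))
X_o, X_d = np.kron(iota, X), np.kron(X, iota)

vec_I = np.eye(n).flatten(order="F")        # 1 for intraregional dyads (i = j), else 0
iota_tilde = np.ones(n * n) - vec_I         # intercept column for interregional dyads

# On intraregional dyads the origin and destination rows coincide, so X_i can be
# taken from either X_o or X_d; all other rows of X_i are zero.
X_i = vec_I[:, None] * X_o
X_o_tilde = X_o - X_i                       # zero on the main-diagonal dyads
X_d_tilde = X_d - X_i

# Design matrix for (2.13), without the distance vector g.
Z = np.column_stack([iota_tilde, vec_I, X_o_tilde, X_d_tilde, X_i])
print(Z.shape)
```

Each intraregional dyad is then explained only by its own intercept and by X _i, while interregional dyads are explained by the adjusted origin and destination characteristics.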

We also consider the simplest possible spatial configuration of the regions, which positions these in a straight line, with a single neighbor to the left and right.^{Footnote 2} For simplicity, we work with a single vector of explanatory variables in the following to avoid having to designate working with a specific explanatory variable. A scalar change in the characteristics of the third region ($\Delta X_{3}$) will produce an n × n matrix of changes in flows ($\Delta Y$), shown in (2.14).

$$\displaystyle{ \Delta Y/\Delta X_{3} = \left (\begin{array}{cccccccc} 0.0&0.0&\beta _{o}&0.0&0.0&0.0&0.0&0.0 \\ 0.0&0.0&\beta _{o}&0.0&0.0&0.0&0.0&0.0\\ \beta _{ d} & \beta _{d} & \beta _{i} & \beta _{d} & \beta _{d} & \beta _{d} & \beta _{d} & \beta _{d} \\ 0.0&0.0&\beta _{o}&0.0&0.0&0.0&0.0&0.0 \\ 0.0&0.0&\beta _{o}&0.0&0.0&0.0&0.0&0.0 \\ 0.0&0.0&\beta _{o}&0.0&0.0&0.0&0.0&0.0 \\ 0.0&0.0&\beta _{o}&0.0&0.0&0.0&0.0&0.0 \\ 0.0&0.0&\beta _{o}&0.0&0.0&0.0&0.0&0.0\\ \end{array} \right ) }$$

(2.14)

The role of the independence assumption is clear in (2.14), where we see from column 3 that the change of outflows from region 3 to all other regions equals β _o, and similarly, row 3 exhibits changes in inflows to region 3, taking the value of the coefficient β _d. The diagonal (3,3) element reflects a response equal to β _i, which reflects the change in intraregional flows arising from the change in X ₃. By virtue of the independence assumption, we have only 2n − 1 non-zero changes in flows: 2(n − 1) interregional changes plus the single intraregional change. All changes involving flows in- and out-of regions other than those in the dyads involving region 3 are zero.
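The pattern in (2.14) is easy to reproduce for any region and any n. The sketch below follows the layout of (2.14), in which the column of the changed region holds outflow responses and its row holds inflow responses; the function name `flow_response` and the coefficient values are hypothetical.

```python
import numpy as np

def flow_response(n, region, beta_o, beta_d, beta_i):
    """n x n matrix of flow changes for a unit change in one region's characteristic.

    Laid out as in (2.14): column `region` holds the responses of outflows,
    row `region` the responses of inflows, and the diagonal element the
    intraregional response. `region` is 1-based.
    """
    r = region - 1
    dY = np.zeros((n, n))
    dY[:, r] = beta_o      # outflows from the changed region
    dY[r, :] = beta_d      # inflows to the changed region
    dY[r, r] = beta_i      # intraregional flow
    return dY

n = 8
dY = flow_response(n, region=3, beta_o=0.8, beta_d=0.6, beta_i=0.5)
n_nonzero = np.count_nonzero(dY)
print(dY)
print(n_nonzero)
```

The non-zero entries are the 2(n − 1) off-diagonal elements of row 3 and column 3 together with the (3,3) element. Every other dyad is unaffected because the model contains no spatial lags.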

This result suggests that for the conventional gravity model, interpreting β _o as the partial derivative impact on flows arising from changes in origin-specific characteristics (X _o) is not too bad, since the only exception is the coefficient β _i in the (3,3) element. The partial derivative for changes in the ith observation (i ≠ 3) would of course look similar to the matrix in (2.14), so averaging over changes to all observations would produce an approximately correct result when interpreting β _o, β _d as if they were simply regression coefficients. However, we will see that this reasoning does not apply to the spatial variants of the interaction model specification, a point made by Thomas-Agnan and LeSage (2014).

The approach taken by LeSage and Thomas-Agnan (2014) to producing scalar summary measures of the impacts arising from changes in characteristics of the regions involves averaging over the cumulative flow impacts associated with changes in all regions, i = 1, …, n. Scalar summaries are consistent with how coefficient estimates for the parameters in a conventional regression model are interpreted, and cumulating the impacts makes intuitive sense in our flow setting.

They calculate scalar summaries by expressing the partial derivatives as shown in (2.15), where the n × n matrices of changes in (logged) flows arising from changing the rth explanatory variable X _i ^r are stored in the n × n matrices Y _i.^{Footnote 3}

$$\displaystyle\begin{array}{rcl} TE& =& \left (\begin{array}{c} \partial Y _{1}/\partial X_{1}^{r} \\ \partial Y _{2}/\partial X_{2}^{r}\\ \vdots \\ \partial Y _{n}/\partial X_{n}^{r} \end{array} \right ) = \left (\begin{array}{c} \tilde{J } d_{1}\beta _{d}^{r} + \tilde{J } o_{1}\beta _{o}^{r} + J\imath _{1}\beta _{i}^{r} \\ \tilde{J } d_{2}\beta _{d}^{r} + \tilde{J } o_{2}\beta _{o}^{r} + J\imath _{2}\beta _{i}^{r}\\ \vdots \\ \tilde{J } d_{n}\beta _{d}^{r} + \tilde{J } o_{n}\beta _{o}^{r} + J\imath _{n}\beta _{i}^{r} \end{array} \right ){}\end{array}$$

(2.15)

In (2.15), Jd _i is an n × n matrix of zeros with the ith row equal to ι _n′, so that Jd _i β _d places β _d in every element of row i, and Jo _i is an n × n matrix of zeros with the ith column equal to ι _n, so that Jo _i β _o places β _o in every element of column i. The matrix J ı _i is an n × n matrix of zeros with a one in the (i, i) position. We define $\tilde{J } d_{i} = Jd_{i} - J\imath _{i}$ and $\tilde{J } o_{i} = Jo_{i} - J\imath _{i}$. We have n sets of n × n outcomes (one for each change in X _i ^r, i = 1, …, n), resulting in an n ² × n matrix of partial derivatives reflecting the total effect on flows from changing the rth characteristic of all n regions, hence the label TE.

The TE consists of origin effects OE = $\left (\tilde{J } o_{1}\beta _{o}^{r},\ldots, \tilde{J } o_{n}\beta _{o}^{r}\right )'$, destination effects DE = $\left (\tilde{J } d_{1}\beta _{d}^{r},\ldots, \tilde{J } d_{n}\beta _{d}^{r}\right )'$, and intraregional effects IE = $\left (J\imath _{1}\beta _{i}^{r},\ldots,J\imath _{n}\beta _{i}^{r}\right )'$.

The total effects on flows can be cumulated and then averaged to produce a scalar summary measure of the total impact of changes in the typical region’s rth characteristic. This takes the form: $te = (1/n^{2})\iota _{n^{2}}^{{\prime}}\cdot TE \cdot \iota _{n}$, where we follow LeSage and Thomas-Agnan (2014) and use lower case te to represent the scalar summary measure of the n ² × n matrix TE. This scalar summary is consistent with the way that regression coefficient estimates are interpreted as averaging over changes in all observations of an explanatory variable. We can also produce scalar summary estimates of the origin effects ($oe = (1/n^{2})\iota _{n^{2}}^{{\prime}}\cdot OE \cdot \iota _{n}$), destination effects ($de = (1/n^{2})\iota _{n^{2}}^{{\prime}}\cdot DE \cdot \iota _{n}$), and intraregional effects ($ie = (1/n^{2})\iota _{n^{2}}^{{\prime}}\cdot IE \cdot \iota _{n}$).

To illustrate these formulas, Table 2.1 reports a numerical example based on values of $\beta _{o} = -0.5,\beta _{d} = 1,\beta _{i} = 0.5$. The scalar summaries oe, de and ie sum to the scalar summary total effect te. In addition to the scalar summary effects estimates, we present the parameters β _o, β _d whose estimates are typically interpreted as origin and destination effects, and whose sum is considered the total effect arising from a change in the rth explanatory variable.
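The averaging in (2.15) and the scalar summary formulas can be checked numerically. The following is a minimal sketch in Python with NumPy, not the authors' code. It assumes n = 8 regions (the dimension of the matrix in (2.14)) and treats Jd _i and Jo _i as 0/1 indicator matrices that are scaled by the coefficients; the function names are hypothetical.

```python
import numpy as np

def indicator_blocks(n, i):
    """Return (Jd_tilde, Jo_tilde, Ji) for a change in region i (0-based index)."""
    Ji = np.zeros((n, n))
    Ji[i, i] = 1.0                  # intraregional (i, i) element
    Jd = np.zeros((n, n))
    Jd[i, :] = 1.0                  # row i: inflows to region i
    Jo = np.zeros((n, n))
    Jo[:, i] = 1.0                  # column i: outflows from region i
    return Jd - Ji, Jo - Ji, Ji

def nonspatial_summaries(n, beta_o, beta_d, beta_i):
    """Scalar summaries oe, de, ie, te for the non-spatial model in (2.15)."""
    blocks = [indicator_blocks(n, i) for i in range(n)]
    DE = np.vstack([b[0] * beta_d for b in blocks])   # n^2 x n
    OE = np.vstack([b[1] * beta_o for b in blocks])   # n^2 x n
    IE = np.vstack([b[2] * beta_i for b in blocks])   # n^2 x n
    TE = DE + OE + IE

    def average(E):
        # (1/n^2) * iota' E iota is the sum of all elements divided by n^2
        return E.sum() / n**2

    return {"oe": average(OE), "de": average(DE),
            "ie": average(IE), "te": average(TE)}

summary = nonspatial_summaries(8, beta_o=-0.5, beta_d=1.0, beta_i=0.5)
print(summary)
```

Under these assumptions the sketch gives oe = −0.4375, de = 0.875, ie = 0.0625 and te = 0.5. Here te coincides with β _o + β _d because the example sets β _i = β _o + β _d; in general the formulas imply te = ((n − 1)/n)(β _o + β _d) + β _i/n.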

Table 2.1 Scalar summary measures of effects for the non-spatial model from a change in the (single) rth characteristic X ^r averaged over all regions

Table 2.2 Scalar summary measures of effects for the exogenous spatial interaction model from a change in the (single) rth characteristic X ^r averaged over all regions

As Thomas-Agnan and LeSage (2014) point out, the results differ slightly from the conventional interpretation of non-spatial gravity models where the coefficient β _o is interpreted as a partial derivative reflecting the impact of changes in origin characteristics and β _d that associated with changing destination characteristics. Although the conventional approach that uses the coefficient sum β _o +β _d as a measure of the total effect on flows arising from changes in origin and destination characteristics would produce a correct inference, the appropriate decomposition into origin, destination and intraregional effects has been wrong in the historical literature.

4.2 Interpreting Estimates from Exogenous Interaction Specifications

The exogenous interaction specification extended to include an intraregional specific intercept and set of explanatory variables is shown in (2.16), where we have added origin and destination specific spatial lags of the (adjusted) explanatory variables matrices.

$$\displaystyle\begin{array}{rcl} y& =& \alpha \tilde{\iota }_{n^{2}} +\tilde{\alpha } \mbox{ vec}(I_{n}) + \tilde{X} _{o}\beta _{o} + \tilde{X} _{d}\beta _{d} + X_{i}\beta _{i} + W_{o}\tilde{X} _{o}\theta _{o} + W_{d}\tilde{X} _{d}\theta _{d} + g\gamma +\varepsilon.\ \ \ \qquad {}\end{array}$$

(2.16)

Changes in the rth explanatory variable now result in two additional terms in the partial derivatives expressions shown in (2.17). The new terms associated with the spatial lags of the explanatory variables reflect (local) spatial spillovers arising from neighbors to the origin and neighbors to the destination regions.

$$\displaystyle\begin{array}{rcl} TE& =& \left (\begin{array}{c} \partial Y _{1}/\partial X_{1}^{r} \\ \partial Y _{2}/\partial X_{2}^{r}\\ \vdots \\ \partial Y _{n}/\partial X_{n}^{r} \end{array} \right ) = \left (\begin{array}{c} \tilde{J } d_{1}\beta _{d}^{r} + \tilde{J } o_{1}\beta _{o}^{r} + J\imath _{1}\beta _{i}^{r} + \tilde{J } d_{1}\theta _{d}^{r} + \tilde{J } o_{1}\theta _{o}^{r} \\ \tilde{J } d_{2}\beta _{d}^{r} + \tilde{J } o_{2}\beta _{o}^{r} + J\imath _{2}\beta _{i}^{r} + \tilde{J } d_{2}\theta _{d}^{r} + \tilde{J } o_{2}\theta _{o}^{r}\\ \vdots \\ \tilde{J } d_{n}\beta _{d}^{r} + \tilde{J } o_{n}\beta _{o}^{r} + J\imath _{n}\beta _{i}^{r} + \tilde{J } d_{n}\theta _{d}^{r} + \tilde{J } o_{n}\theta _{o}^{r} \end{array} \right )\qquad {}\end{array}$$

(2.17)

A similar approach to decomposing the total effects can be used along with conversion of these to scalar summary estimates. In this case we have: $te = oe + de + ie + noe + nde$, where the new terms noe and nde are labeled network origin effects and network destination effects. These are calculated as network origin effects ($noe = (1/n^{2})\iota _{n^{2}}^{{\prime}}\cdot NOE \cdot \iota _{n}$) and network destination effects ($nde = (1/n^{2})\iota _{n^{2}}^{{\prime}}\cdot NDE \cdot \iota _{n}$), where $NOE = \left (\tilde{J } o_{1}\theta _{o}^{r},\ldots, \tilde{J } o_{n}\theta _{o}^{r}\right )'$ and $NDE = \left (\tilde{J } d_{1}\theta _{d}^{r},\ldots, \tilde{J } d_{n}\theta _{d}^{r}\right )'$.

Intuitively, these new scalar summary measures of the origin- and destination-specific spatial spillover effects reflect the impact of changes in characteristics of regions neighboring the origin and destination on flows between the typical dyad. We extend our previous example, using $\theta _{o} = -0.25,\theta _{d} = 0.5$ in Table 2.2, to illustrate the difference between the true effects and those obtained by treating β _o, β _d, θ _o, θ _d as if these were partial derivatives.

In contrast to the Table 2.1 non-spatial case, the total effects calculated in Table 2.2 by summing up coefficients $\beta _{o} +\beta _{d} +\theta _{o} +\theta _{d} = 0.75$ are not equal to the true total effects. We also see discrepancies between the true origin, destination, network origin and network destination effects (based on actual partial derivatives) and those from simply interpreting the coefficient estimates as if they were partial derivatives.
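The gap between the coefficient sum and the scalar summary total effect can be seen by applying the averaging formulas to (2.17) as printed. The sketch below is a hedged illustration in Python with NumPy, assuming n = 8 regions and 0/1 indicator matrices for Jd _i and Jo _i; the function name is hypothetical, and the numbers it produces are computed from the formulas rather than copied from Table 2.2.

```python
import numpy as np

def exogenous_summaries(n, beta_o, beta_d, beta_i, theta_o, theta_d):
    """Scalar summaries for the exogenous specification, following (2.17)."""
    totals = dict.fromkeys(["oe", "de", "ie", "noe", "nde"], 0.0)
    for i in range(n):
        Ji = np.zeros((n, n))
        Ji[i, i] = 1.0
        Jd_t = np.zeros((n, n))
        Jd_t[i, :] = 1.0
        Jd_t -= Ji                   # row i without the (i, i) element
        Jo_t = np.zeros((n, n))
        Jo_t[:, i] = 1.0
        Jo_t -= Ji                   # column i without the (i, i) element
        totals["oe"] += (Jo_t * beta_o).sum()
        totals["de"] += (Jd_t * beta_d).sum()
        totals["ie"] += (Ji * beta_i).sum()
        totals["noe"] += (Jo_t * theta_o).sum()
        totals["nde"] += (Jd_t * theta_d).sum()
    # Divide the cumulated effects by n^2, as in the scalar summary formulas
    out = {name: value / n**2 for name, value in totals.items()}
    out["te"] = sum(out.values())
    return out

effects = exogenous_summaries(8, beta_o=-0.5, beta_d=1.0, beta_i=0.5,
                              theta_o=-0.25, theta_d=0.5)
coefficient_sum = -0.5 + 1.0 - 0.25 + 0.5
print(effects, coefficient_sum)
```

With these inputs the sketch yields noe = −0.21875, nde = 0.4375 and te = 0.71875, compared with a coefficient sum of 0.75, which is one way to see why the two need not agree once spatial lags of the explanatory variables enter the model.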

4.3 Interpreting Estimates from Endogenous Interaction Specifications

The endogenous interaction specification extended to include an intraregional specific intercept and set of explanatory variables is shown in (2.18), where we have added origin and destination specific spatial lags of the dependent variable to capture origin, destination and origin-destination dependence of the type proposed by LeSage and Pace (2008).

$$\displaystyle\begin{array}{rcl} y& =& \rho _{o}W_{o}y +\rho _{d}W_{d}y +\rho _{w}W_{w}y +\alpha \tilde{\iota } _{n^{2}} +\tilde{\alpha } \mbox{ vec}(I_{n}) + \tilde{X} _{o}\beta _{o} + \tilde{X} _{d}\beta _{d} \\ & & +X_{i}\beta _{i} + g\gamma +\varepsilon. {}\end{array}$$

(2.18)

Working with the expression for the data generating process of this model, LeSage and Thomas-Agnan (2014) show that the partial derivatives ∂ y∕∂ X ^r′, take the form shown in (2.19).

$$\displaystyle\begin{array}{rcl} TE& =& \left (\begin{array}{c} \partial Y _{1}/\partial X_{1}^{r} \\ \partial Y _{2}/\partial X_{2}^{r}\\ \vdots \\ \partial Y _{n}/\partial X_{n}^{r} \end{array} \right ) \\ & =& (I_{n^{2}} -\rho _{o}W_{o} -\rho _{d}W_{d} -\rho _{w}W_{w})^{-1}\left (\begin{array}{c} \tilde{J } d_{1}\beta _{d}^{r} + \tilde{J } o_{1}\beta _{o}^{r} + J\imath _{1}\beta _{i}^{r} \\ \tilde{J } d_{2}\beta _{d}^{r} + \tilde{J } o_{2}\beta _{o}^{r} + J\imath _{2}\beta _{i}^{r}\\ \vdots \\ \tilde{J } d_{n}\beta _{d}^{r} + \tilde{J } o_{n}\beta _{o}^{r} + J\imath _{n}\beta _{i}^{r} \end{array} \right ).\qquad {}\end{array}$$

(2.19)

These represent an extension of the partial derivatives from the non-spatial model, where in this endogenous spatial interaction specification, the n ² × n ² matrix inverse: $A^{-1} = (I_{n^{2}} -\rho _{o}W_{o} -\rho _{d}W_{d} -\rho _{w}W_{w})^{-1}$, pre-multiplies the non-spatial effects. (Of course, in the non-spatial case $\rho _{o} =\rho _{d} =\rho _{w} = 0$.) A similar decomposition of the total effects can be applied to produce origin effects (OE), destination effects (DE), intraregional effects (IE) and network effects (NE). The network effects reflect spatial spillovers from: neighbors to the origin, neighbors to the destination and neighbors to the origin to neighbors of the destination.

As an illustration of the nature of these partial derivatives, consider the example shown in (2.20), where we consider a change to the single observation X ₃, based on the same numerical values set forth in the previous section for $\beta _{o} = -0.5,\beta _{d} = 1.0,\beta _{i} = 0.5$, while setting $\rho _{o} = 0.5,\rho _{d} = 0.4$ and $\rho _{w} = -\rho _{o}\rho _{d} = -0.2$.^{Footnote 4}

$$\displaystyle{ \Delta Y/\Delta X_{3} = \left (\begin{array}{cccccccc} \phantom{0,}0.052 & - 0.086& - 0.777& - 0.069& \phantom{0,}0.121 & \phantom{0,}0.171 & \phantom{0,}0.185 & \phantom{0,}0.187 \\ \phantom{0,}0.337 & \phantom{0,}0.199 & - 0.492& \phantom{0,}0.216 & \phantom{0,}0.406 & \phantom{0,}0.457 & \phantom{0,}0.470 & \phantom{0,}0.473 \\ \phantom{0,}2.048 & \phantom{0,}1.910 & \phantom{0,}1.219 & \phantom{0,}1.927 & \phantom{0,}2.117 & \phantom{0,}2.168 & \phantom{0,}2.181 & \phantom{0,}2.184 \\ \phantom{0,}0.318 & \phantom{0,}0.180 & - 0.511& \phantom{0,}0.197 & \phantom{0,}0.387 & \phantom{0,}0.438 & \phantom{0,}0.451 & \phantom{0,}0.454 \\ - 0.043& - 0.181& - 0.872& - 0.164& \phantom{0,}0.026 & \phantom{0,}0.077 & \phantom{0,}0.090 & \phantom{0,}0.093 \\ - 0.118& - 0.256& - 0.947& - 0.239& - 0.050& \phantom{0,}0.001 & \phantom{0,}0.015 & \phantom{0,}0.017 \\ - 0.134& - 0.272& - 0.963& - 0.255& - 0.065& - 0.014& - 0.001& \phantom{0,}0.002 \\ - 0.136& - 0.275& - 0.965& - 0.257& - 0.068& - 0.017& - 0.004& - 0.001\\ \end{array} \right ). }$$

(2.20)

As LeSage and Thomas-Agnan (2014) point out, the spatial autoregressive specification results in the presence of network spillover effects, shown by the non-zero elements in rows and columns other than 3. This means that a change in say the attractiveness of region 3 impacts flows throughout the network. This arises because the spatial autoregressive model specification allows for global spillovers which can be viewed as diffusion throughout the network of the increased attractiveness of region 3.
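The diffusion pattern in (2.20) can be reproduced with a few lines of linear algebra. The sketch below (Python with NumPy) is an illustration under stated assumptions rather than the authors' code: W is taken to be a row-normalized contiguity matrix for eight regions on a line, the flow matrix is vectorized column by column, and the weight matrices are assumed to be W_o = W ⊗ I_n, W_d = I_n ⊗ W and W_w = W ⊗ W. These definitions are not restated in this section, so they should be read as assumptions.

```python
import numpy as np

n = 8
beta_o, beta_d, beta_i = -0.5, 1.0, 0.5
rho_o, rho_d = 0.5, 0.4
rho_w = -rho_o * rho_d              # -0.2, as in the text

# Assumed W: regions on a line, left/right neighbors, rows sum to one
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = 1.0
    W[i + 1, i] = 1.0
W = W / W.sum(axis=1, keepdims=True)

# Assumed Kronecker definitions of the three weight matrices
I_n = np.eye(n)
W_o = np.kron(W, I_n)
W_d = np.kron(I_n, W)
W_w = np.kron(W, W)
A = np.eye(n * n) - rho_o * W_o - rho_d * W_d - rho_w * W_w

# Non-spatial response to a change in X_3, laid out as in (2.14)
dY_nonspatial = np.zeros((n, n))
dY_nonspatial[:, 2] = beta_o        # column 3
dY_nonspatial[2, :] = beta_d        # row 3
dY_nonspatial[2, 2] = beta_i        # (3, 3) element

# Pre-multiply the vectorized response by A^{-1} via a linear solve
rhs = dY_nonspatial.flatten(order="F")
dY = np.linalg.solve(A, rhs).reshape((n, n), order="F")
print(np.round(dY, 3))
```

Under these assumptions the output agrees with the entries printed in (2.20) to three decimals, for example 1.219 in the (3,3) position and 0.052 in the (1,1) position. Solving the linear system avoids forming the n ² × n ² inverse explicitly.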

Of course, the largest network spillover impacts still tend to reside in the third row and column, since the change in attractiveness of region 3 has the largest impact on flows involving region 3 in the O-D dyads. The magnitude of impact decreases as we move further from the (3,3) element, with the non-linear nature of this decay of influence determined by a number of factors. Specifically, the matrix W plays a role, as well as the spatial dependence parameters ρ _o, ρ _d, ρ _w. For this simple example, where regions are configured to lie in a line, moving to row and column elements further from the (3,3) position should reflect more distant neighbors. An implication of the increase in paths through which the flows must pass to reach the (8,8) and (1,1) dyads in the network is that smaller network effects arise in the flow matrix for these dyads.

One point is that we follow LeSage and Thomas-Agnan (2014) who calculate only a single scalar summary measure of the network effects, rather than attempt to make distinctions between origin- and destination-specific network effects. Because of the non-linearity and diffusion of effects evident in the matrix shown in (2.20), it seems prudent to focus on a single measure of spatial spillovers falling on all regions in the network. This is of course in contrast to the exogenous spatial interaction specification where it is a simple matter to produce a decomposition that separates network origin and network destination effects.

We extend our previous example, using the same values: $\beta _{o} = -0.5,\beta _{d}\,=\,1,\beta _{i} = 0.5,\rho _{o} = 0.5,\rho _{d} = 0.4,\rho _{w} = -0.2$ to produce correct partial derivatives. These are contrasted with the typical interpretation of β _o, β _d as if these were partial derivatives in Table 2.3.

Table 2.3 Scalar summary measures of effects for the endogenous spatial interaction model from a change in the (single) rth characteristic X ^r averaged over all regions

For the case of an endogenous spatial interaction specification, we see little relationship between the coefficients β _o, β _d and the true origin and destination effects. This is similar to the case of conventional spatial regression models where practitioners have historically misinterpreted these coefficient estimates as if they represented partial derivatives (see LeSage and Pace 2008). For an application involving commuting flows between regions in Toulouse, France see LeSage and Thomas-Agnan (2014), who provide an interpretative discussion of the various effects estimates and inferences associated with the endogenous spatial interaction model. In the next section we provide an illustration of estimates and inferences for the case of the exogenous spatial interaction model that we have proposed here.

5 An Applied Illustration Involving Movement of Teachers Between School Districts

We use flows of teachers between 67 county-level school districts in Florida over the period 1995 to 2004. The flows were constructed by tracing the location of 102,327 teachers in the system during 1995, who were still in the system during 2004. We ignore teachers that left the system and those that entered during this time period. The impact of this is an issue to be addressed in future work.

One way to motivate dependence is to view the county-level school districts as representing a network system. Changes by a single school district that affect working conditions, salary or employment requirements of teachers will have an impact on the own-district as well as other nearby districts that can be viewed as nodes in the statewide network. The movement of teachers may be to and from other schools within the own-county or district or they may be between districts.^{Footnote 5}

In the empirical trade literature, Poisson pseudo-maximum likelihood estimation methods have become popular as a way of dealing with several econometric issues that arise when modeling origin-destination flows [e.g., Santos Silva and Tenreyro 2006, 2010, 2011 and Gourieroux et al. 1984]. We rely on our exogenous spatial interaction specification that allows for spatial dependence between flows from nearby regions/school districts.

One econometric issue that arises when modeling inter-district flows of teachers is that these reflect ‘count data’, or observations taking discrete values or zero magnitudes in the case where no flows between dyads reflecting districts i and j occur. This suggests a Poisson spatial interaction model is most appropriate. There are several econometric advantages to this model specification along with Poisson pseudo-maximum likelihood (PPML) estimation procedures over log-normal specifications that either delete zero flows or modify the dependent variable using ln(y + 1) to accommodate the log transformation of the multiplicative gravity model. One is that the coefficients on logged explanatory variables (X) in the (exponential) relationship involving non-logged flow magnitudes as the dependent variable (y) can be interpreted as the elasticity of the conditional expectation of y _i with respect to X _i. Since Jensen’s inequality implies that E(lny) ≠ lnE(y), heteroscedasticity in log-linear regression gravity models can lead to inconsistent elasticity estimates, which is not a problem with PPML estimates.^{Footnote 6}
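To make the estimation idea concrete, the following sketch fits a Poisson model with an exponential mean by Newton-Raphson, keeping zero flows in levels so that no log transformation of y is needed. It is a generic illustration in Python with NumPy, not the estimation code used for this chapter; the function name `ppml_fit`, the simulated regressors and the coefficient values are all hypothetical, and the data are synthetic rather than the Florida sample.

```python
import numpy as np

def ppml_fit(X, y, max_iter=50, tol=1e-10):
    """Poisson pseudo-maximum likelihood estimates via Newton-Raphson.

    X: (m, k) design matrix whose first column is a constant.
    y: (m,) non-negative flows in levels (zeros are allowed).
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())              # start at the constant-only fit
    for _ in range(max_iter):
        mu = np.exp(X @ beta)               # conditional mean E(y | X)
        score = X.T @ (y - mu)              # gradient of the log-likelihood
        info = X.T @ (X * mu[:, None])      # negative Hessian
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Synthetic dyads: logged size and logged distance with assumed elasticities
rng = np.random.default_rng(0)
m = 4000
log_size = rng.normal(0.0, 0.5, m)
log_dist = rng.normal(0.0, 0.5, m)
X = np.column_stack([np.ones(m), log_size, log_dist])
beta_true = np.array([0.2, 1.0, -1.0])
y = rng.poisson(np.exp(X @ beta_true)).astype(float)

beta_hat = ppml_fit(X, y)
zero_share = np.mean(y == 0)                # zeros stay in the sample
print(beta_hat, zero_share)
```

Because the regressors are in logs and the dependent variable is in levels, the slope estimates in `beta_hat` can be read as elasticities of the conditional expectation of y, which is the interpretation described above.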

In addition to dealing with heteroscedasticity, the Poisson gravity model along with PPML estimation procedures does not require taking logs of the flows, which avoids the problem of undefined logs in the presence of zero flows. With regard to the zero problem, our sample of flows between 67 counties/school districts contains 1,266 non-zero flow magnitudes out of a possible 67 × 67 = 4,489 flows between the 67 districts. This reflects 28.2 % non-zeros and 71.8 % zeros. Although the prevalence of zero values has an adverse impact on the PPML estimates, Santos Silva and Tenreyro (2011) point out that the PPML model works better than alternative approaches even in the face of a large proportion of zero flow values.

This allows us to make the point that the interpretative considerations discussed above rest on coefficient estimates for the parameters β _o, β _d, θ _o, θ _d, which should be produced using a valid estimation approach. Our derivations hold true for any valid estimates of these parameters.

Characteristics used are shown in Table 2.4, where values for these variables are for the year 1995 in an attempt to avoid a simultaneity problem. Use of the (log) number of teachers (in the origin and destination districts) as explanatory variables captures the basic notion behind gravity models where the magnitude of district interaction (in our case teacher movement) is directly proportional to the product of district size measures (in our case the (log) of the number of teachers in origin and destination school districts).^{Footnote 7} Direct proportionality would result in an elasticity coefficient for these two variables equal to one.

In addition to the origin and destination size variables, two other explanatory variables were used, one reflecting a teacher characteristic, Salary, and the other a student characteristic, Poverty. The decision to use only two other explanatory variables was for the sake of simplicity in our illustrative example.

The traditional gravity model posits that flows are inversely proportional to distance, so we would expect a coefficient of minus one on the logged distance variable. Use of the conventional log transformation of the number of teachers, salary, percentage of students in poverty and distance variables allows us to interpret these estimates as elasticities. We adopt the approach that introduces a separate model for within district flows of teachers, which tend to have large values relative to between district flows.

Table 2.4 District-level variables used in the model

Table 2.5 Coefficient estimates from the PPML model

As argued in the previous section, the coefficients in Table 2.5 should not be interpreted as if they represent the true effects associated with changes in the explanatory variables of the model. Table 2.6 shows the effects estimates that represent actual partial derivatives showing how flows respond (in elasticity terms on average over the sample) to changes in the number of teachers, salary of teachers and poverty status of students in origin and destination districts, the own-district and neighboring districts.

We can use the estimates in Table 2.5 to make the point that a non-spatial interaction specification would suffer from omitted variables bias due to its exclusion of the spatial lags W _o X _o, W _d X _d variables, since all but one of these variables (W _o teachers_o) are significantly different from zero at the 99 % level.

Table 2.6 Effects estimates from the exogenous spatial interaction model

From Table 2.6, we see that larger origin and destination districts (measured by the number of teachers in these districts) lead to an increase in flows. Given that the effects estimates reflect elasticity responses of flows, they point to flows as having a slightly less than proportional relationship with size.

The intradistrict effects of size are positive and small, but statistically significant, suggesting more intradistrict flows for larger districts, which makes intuitive sense. Spatial spillovers from larger districts neighboring the origin district (network origin effects in the table) are negative, but not significant (using the 0.10 level), while those from larger districts neighboring the destination district are positive and significant. This suggests a competition effect associated with larger neighboring districts, which produces more inflows to destination regions from these larger neighboring districts. The response of teacher flows to district size overall (the total effect) is such that a 10 % increase in the size of the typical district would produce 14 % more flows across the entire network. This includes a small (0.15 %) significant increase in within district flows, a 0.562 % increase of inflows to destination districts from neighbors, as well as a 7.28 % increase in outflows from origins and a 6.41 % increase of inflows to destinations.

Teacher salaries (logged) exhibit insignificant origin and destination effects, suggesting these do not impact teacher decisions to move from one school district to another. Higher salaries have a small but significant effect on within district movement of teachers. This is not surprising given that higher salaries are positively correlated with years of experience. The seniority system gives teachers with more years of service preference in filling vacant jobs in other schools within the same district. The origin spillover effects of teacher salary are positive, suggesting a competitive effect where higher salaries in neighboring districts increase outflows from origin districts. Destination spillover effects of salary are negative, suggesting a competitive effect of districts with higher salaries that neighbor the destination decreasing inflows. Given that these effects estimates are elasticities, we can say that the positive impact of origin spillover effects is more important than the negative impact of destination spillover effects.

For the poverty variable, an increase in the (logged) proportion of students in poverty would increase outflows from the origin district, and decrease inflows to a destination district, which makes intuitive sense. The effect of poverty on within district teacher flows is small and negative, but significant. The effect of more poverty in districts neighboring the origin is negative and significant, meaning that inflows from neighboring districts would be smaller in this case. This suggests a teacher retention effect for districts surrounded by those with more students in poverty. The effect of more poverty in districts neighboring the destination is positive and significant, suggesting more inflows to destination districts having neighbors with more students in poverty. This suggests that teachers are more likely to move to a neighboring district from surrounding districts with more students in poverty, a competition effect. The retention and competition effects of poverty in neighboring school districts are reasonably large in magnitude, taking values nearly half the magnitude of origin and destination effects for the poverty variable. It is interesting that the total effect of students in poverty is not significantly different from zero. This suggests that the retention and competition effects are offsetting. One way to view this would be that teachers are needed to fill posts in all schools including those with high proportions of students in poverty.

6 Conclusion

We reiterate the point made by LeSage and Thomas-Agnan (2014) and Thomas-Agnan and LeSage (2014) that the structure of explanatory variables used in non-spatial and spatial interaction models is such that we cannot interpret coefficients associated with origin explanatory variables (that we label X _o here) and coefficients from destination explanatory variables (that we label X _d here) as reflecting typical regression partial derivatives ∂ y∕∂ X _o and ∂ y∕∂ X _d, showing how changes in origin (destination) region characteristics impact flows (on average across the sample of n ² dyads as is typical of regression estimates). This is because we cannot change characteristics of origin (destination) regions while holding those of destination (origin) regions constant, which is typical of how partial derivatives are viewed. It should be clear that a change in the rth characteristic of a single region i, X _i ^r, will produce changes in both X _o, X _d, since by definition $\Delta X_{o}^{r} =\iota _{n} \otimes (X^{r} + \Delta X_{i}^{r})$, and $\Delta X_{d}^{r} = (X^{r} + \Delta X_{i}^{r}) \otimes \iota _{n}$.

We provide a discussion of exogenous and endogenous spatial interaction model specifications that are each suited to differing applied situations. The argument advanced is that an exogenous specification is most appropriate when characteristics of neighboring regions exert an influence on variation in flows between dyads, but do not produce feedback effects producing changes in the long-run steady state equilibrium of the network of flows. Examples include situations involving temporary or unpredictable events that do not evoke endogenous interaction because of the unpredictability of changes taking place in neighboring regions. In contrast, endogenous interaction specifications are more appropriate for situations where predictable or permanent changes take place in the network structure such that economic agents react to these changes by changing decisions regarding routes of movement for people, commodities, etc.

In addition to setting forth expressions for the true partial derivatives of non-spatial and endogenous spatial interaction models and associated scalar summary measures from LeSage and Thomas-Agnan (2014), we propose new scalar summary measures for the exogenous spatial interaction specification introduced here. An illustration applies the exogenous spatial interaction model to a flow matrix of teacher movements between 67 school districts in the state of Florida.

Notes

1. Intraregional flows are recorded on the main diagonal of the flow matrix.
2. The westernmost region at the beginning of the line of regions has a single neighbor to the right, and the easternmost region at the end of the line has a single neighbor to the left.
3. Our expressions differ slightly from those of LeSage and Thomas-Agnan (2014) because of our modification of the model specification to incorporate X _i variables to model intraregional variation in flows.
4. This example is identical to that in Thomas-Agnan and LeSage (2014).
5. Florida has county-level districts so that districts and counties coincide in our analysis.
6. Santos Silva and Tenreyro (2006) note there is strong evidence that disturbances from log-linear gravity models are heteroscedastic.
7. In the case of interregional commodity flows, the measure of regional size is typically gross regional product or regional income. The model predicts more interaction in the form of commodity flows between regions of similar (economic) size than regions dissimilar in size. For the case of migration flows, population would be a logical measure of regional size, and in other contexts such as ours involving teacher flows between school districts, use of the number of teachers in each district seems a reasonable measure of district size.

References

Bolduc D, Laferriere R, Santarossa G (1992) Spatial autoregressive error components in travel flow models. Reg Sci Urban Econ 22(3):371–385
Curry L (1972) A spatial analysis of gravity flows. Reg Stud 6(2):131–147
Curry L, Griffith DA, Sheppard ES (1975) Those gravity parameters again. Reg Stud 9(3):289–296
Fischer MM, Griffith DA (2008) Modeling spatial autocorrelation in spatial interaction data: an application to patent citation data in the European Union. J Reg Sci 48(5):969–989
Fischer MM, Wang J (2011) Spatial data analysis: models, methods and techniques. Springer, Heidelberg
Gourieroux C, Monfort A, Trognon A (1984) Pseudo maximum likelihood methods: applications to Poisson models. Econometrica 52(3):701–720
Griffith DA, Jones KG (1980) Explorations into the relationship between spatial structure and spatial interaction. Environ Plan A 12:187–201
LeSage JP, Pace RK (2008) Spatial econometric modeling of origin-destination flows. J Reg Sci 48(5):941–967
LeSage JP, Pace RK (2009) Introduction to spatial econometrics. Taylor & Francis/CRC Press, Boca Raton, FL
LeSage JP, Thomas-Agnan C (2014) Interpreting spatial econometric origin-destination flow models. J Reg Sci 55(2):188–208
Santos Silva JMC, Tenreyro S (2006) The log of gravity. Rev Econ Stat 88(4):641–658
Santos Silva JMC, Tenreyro S (2010) On the existence of the maximum likelihood estimates for Poisson regression. Econ Lett 107(2):310–312
Santos Silva JMC, Tenreyro S (2011) Further simulation evidence on the performance of the Poisson-PML estimator. Econ Lett 112(2):220–222
Sen AK, Smith TE (1995) Gravity models of spatial interaction behavior. Springer, Heidelberg
Thomas-Agnan C, LeSage JP (2014) Spatial econometric OD-flow models. In: Fischer MM, Nijkamp P (eds) Handbook of regional science. Springer, Berlin, pp 1653–1673

Author information

Authors and Affiliations

Department of Finance and Economics, McCoy College of Business Administration, Texas State University, San Marcos, TX, USA
James P. LeSage
Institute for Economic Geography and GIScience, Vienna University of Economics and Business, Vienna, Austria
Manfred M. Fischer

Corresponding author

Correspondence to James P. LeSage.

Editor information

Editors and Affiliations

Department of Economics, University of Bologna, Rimini, Italy
Roberto Patuelli
Università Cattolica del Sacro Cuore, Rome, Italy
Giuseppe Arbia

Cite this chapter

LeSage, J.P., Fischer, M.M. (2016). Spatial Regression-Based Model Specifications for Exogenous and Endogenous Spatial Interaction. In: Patuelli, R., Arbia, G. (eds) Spatial Econometric Interaction Modelling. Advances in Spatial Science. Springer, Cham. https://doi.org/10.1007/978-3-319-30196-9_2

DOI: https://doi.org/10.1007/978-3-319-30196-9_2
Published: 23 July 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30194-5
Online ISBN: 978-3-319-30196-9
eBook Packages: Economics and Finance (R0)
