Introduction

Transport infrastructure such as modern rail transit systems (heavy and light rail) is believed to have improved accessibility from residential areas to Central Business Districts (CBDs), where employment and economic activities often concentrate. The classical urban land economics theories proposed by Alonso (1964), Muth (1969) and Mills (1972) indicate that transport cost is an important determinant of land value: with improved accessibility to the CBD, land value increases as a result of the decreasing transport cost. Research on the locational externalities generated by rail transit systems has therefore placed emphasis on land value, since these externalities in turn affect residential and commercial land.

In the case of residential and commercial land, it is expected that the benefits of a rail transit system are capitalised into land values, which are observed in the form of residential and commercial property values. Banister and Berechman (2001) argue that the improvements in accessibility for areas served by rail transit systems can potentially trigger several major positive locational externalities, in particular for properties located within close proximity to railway stations. They argue further that these positive locational externalities should be viewed as additional to the primary benefits of accessibility improvement.

The evidence from empirical research investigating the effect of rail transit systems on property values in Europe and North America suggests that the ex ante outcomes of rail transit are often difficult to predict, since property value premiums vary between studies. Whilst premiums are reported, there is often a lack of consideration given to why these findings vary significantly between studies. Indeed, previous studies ‘do not provide a firm basis to judge future impacts’ (Hess and Almeida 2007: p. 1042). An important question is therefore which local factors this variation in premiums depends upon. By exploring how and why premiums vary across space, insights can be gained into the appropriate design of future rail transit systems.

The purpose of this study is therefore to estimate the spatial variation of the effect of proximity to light rail transit (LRT) stations on residential property values. Using an LRT system within Greater Kuala Lumpur (see Footnote 1) in Malaysia, this study specifically attempts to explore the extent of spatial variation in value premiums and the factors upon which this variation depends. To map this spatial variation, a technique known as geographically weighted regression (GWR) is used. GWR allows local rather than global parameters to be estimated, and thus provides a way of accommodating the local geography of the relationship between residential property values and the LRT system.

Knowledge about the effect of rail transit on property values has important implications for the potential implementation of Land Value Capture (LVC), a technique designed to capture the property value premiums created by the provision of public services, which can provide a funding mechanism for new transport infrastructure. A rail transit system can be seen as an investment with financial returns through increased land values. These values could potentially be recouped through, for example, increased future tax revenue from enhanced economic activity following the infrastructure investment, joint development where the public sector internalises the benefits of infrastructure improvement through land sales, or a specific betterment tax on developers or property owners (Medda 2012). The LVC approach has been widely implemented internationally and has become increasingly relevant in the context of governments facing severe budgetary constraints (Smith et al. 2006, 2009; van der Krabben and Needham 2008; Medda 2012). To explore the ideas behind land value capture, it is important to understand the relationships between transport accessibility and land value, particularly the factors which determine positive land value uplift. Such uplift has usually been estimated from generic time series property models, which are unable to estimate how positive externalities vary with distance from the stations or how sensitive they are to local factors (other than station quality) (van der Krabben and Needham 2008; Medda 2012). Given that the fairness of taxation measures and/or government financial returns on investment are likely to depend on such spatial variation, it is crucial that it is included within property valuation models. By developing an understanding of the likely contextual factors affecting land value premiums from LRT, different transport schemes for the same route may be judged to have different land value potential, which may, in turn, help determine which transport scheme to pursue.

The paper is organised as follows. The next section reviews the existing literature and outlines the relevant lessons learned from these previous studies. The estimation methods are then considered, and the use of the GWR technique is justified and explained. The study area is then introduced and data acquisition described. The results of the estimation are then presented and discussed. The paper concludes with a review of the implications of the findings for an LVC policy.

Existing Research and Implication for This Study

Over the past 40 years, a considerable body of research has emerged on the effects of transportation investment on property values. Throughout the 1960s, considerable attention was focused on the comparatively broad issue of how transportation investment influences urban form and consequently, urban property values. A driving force of this research was the notion that any significant improvement in the transportation system that increases accessibility and reduces transportation costs, usually assessed in terms of the CBD, should be capitalised in property values (Alonso 1964; Muth 1969; Mills 1972). Whilst this paper focuses on public transport systems, in particular LRTs, it is important to recognise road improvements are also likely to bring premiums in property values (Boarnet and Chalermpong 2001).

Previous reviews of the effect of public transport on property values (Diaz 1999; Ryan 1999; Cervero et al. 2002; Smith et al. 2006, 2009) found the type of public transport to be quite influential on value premiums. For example, the literature has shown that heavy rail produces greater effects on the properties it serves than light rail does. This has been attributed to the faster speeds, more frequent trains and greater geographical coverage of heavy rail. In the case of commuter rail, also known as suburban rail, previous studies have shown that properties located near commuter rail stations receive greater premiums than those near heavy or light rail, particularly when a commuter rail station is at the centre of, or within walking distance of, a commercial core or main street (see, for example, Armstrong 1994; Cervero and Duncan 2002). In the case of bus routes, several studies (see, for example, Goodwin and Lewis 1997; Barker 1998; Rodriguez and Targa 2004) have shown that property values near bus routes receive only modest uplifts from transit proximity. Hess and Almeida (2007) attribute this modest effect to the lack of permanent fixed infrastructure associated with bus routes.

Previous reviews also suggest that there is much inconsistency across studies and that the estimated value premiums vary considerably in magnitude. Such cross-study variation may be due to the complexity of metropolitan development, unpredictable travel patterns, the extent of the accessibility improvement, the relative attractiveness of the locations near station areas and the real estate markets in the region (Ryan 1999; Smith et al. 2006, 2009). Yet these differences have rarely been modelled across space within a single study.

As this paper focuses on rail systems, Table 1 provides a summary of the effect of such systems on property values since the 1990s. Many studies found in scientific publications since the 1990s provide transport researchers with sufficient evidence to observe how both light and heavy rail transit systems affect real estate markets. Some of these studies examine the value of property located within close proximity to rail stations and then make comparisons with similar properties located further away from rail stations. This is based on the assumption that immediate locations are expected to have higher effects than locations further away. Researchers have also investigated the effect on property values in anticipation of a rail line before construction or service begins (see, for example, Knapp et al. 2001; van der Krabben and Needham 2008). Although the majority of studies have examined the increased value of residential property by being located closer to rail stations/line, there has also been research exploring the relationship between rail transit and commercial-office property values (see, for example, Landis et al. 1995; Chesterton 2000; van der Krabben and Needham 2008).

Table 1 Summary of the effect of rail transit systems on property values

Reflecting on Table 1, most studies in both Europe and North America suggest that proximity to rail transit systems increases property values, but with varying magnitude. For instance, studies on the effects of rail transit systems on property values carried out in cities such as London (UK), Newcastle upon Tyne (UK), Sheffield (UK), Atlanta (US), Philadelphia (US), Boston (US), Washington, D.C. (US), San Francisco (US), New York (US), Portland, Oregon (US), Los Angeles (US), Ottawa (Canada) and North Carolina (US) have found a positive effect. Only studies carried out in cities such as Manchester (UK), Atlanta (US) and the San Francisco Bay Area have shown weak, mixed or no effect. The findings of these studies indicate that property located within close proximity to rail stations experienced up to a 25 % premium (see, for example, Debrezion et al. 2006). However, Chen et al. (1997) found negative premiums on property values in immediate station areas, which they attributed to nuisance effects including noise, safety, aesthetics and traffic.

Another interesting characteristic of the effects of rail transit systems on property values is observed across various neighbourhood types—income and social divisions are common in the empirical literature. For example, consistent with surveys of travel behaviour (Redman et al. 2013), a study carried out by Nelson (1992) in Atlanta found that property values increased in low-income neighbourhoods whilst a study carried out by Gatzlaff and Smith (1993) in Miami found that only high-income neighbourhoods experienced increased property values. Gatzlaff and Smith (1993) argue that the variation in the findings of the empirical work is attributed to local factors in each city. Such inconsistencies, where research has failed to understand the workings of local factors, do little to help policy makers faced with a need to estimate likely value premiums ex ante.

The Estimation Methods

Building on a conventional global hedonic price model (HPM), a spatial econometric method called Geographically Weighted Regression (GWR) is used to calibrate local regression parameters by weighting the distance between one data point and another through the coordinates of data.

The Hedonic Price Model (HPM)

In order to estimate the effects of the LRT system on residential property values, this paper initially uses a standard HPM in which residential property value is a function of nearby transport services (the focus variable; see Footnote 2) and of the structural or physical characteristics of a property, the locational attributes and the socioeconomic characteristics of the area in which it is located (the free variables; see Footnote 3). As is widely recognised, the HPM is a method used to analyse a market for a single commodity with many attributes, in particular residential properties. The method is based on the consumer theory proposition that the characteristics of a commodity determine its price. From a methodological perspective, the HPM is a suitable method for this study since its estimates can be used to interpret the importance of explanatory variables in defining the relationship between residential property value and light rail station proximity. The general form of a hedonic pricing model can be presented as:

$$ {\mathrm{P}}_{\mathrm{i}}= f\left(\mathrm{F},\mathrm{S},\mathrm{L}\right)+{\upvarepsilon}_{\mathrm{i}} $$
(1)

where,

Pi: the market price of property i
F: a vector of focus variables
S and L: the vectors of structural and locational variables
εi: a vector of random error terms.

This has been termed the traditional hedonic specification.

Table 1 highlights how researchers have mostly used the selling price or rent of residential properties as the dependent variable within hedonic pricing studies to estimate property premiums from close proximity to rail stations. Many researchers have considered hedonic models to be the best method for investigating the relationship between rail transit systems and land value. However, the method is subject to criticism arising primarily from its inability to take into account the spatial effects (spatial heterogeneity and spatial autocorrelation) of the relationship being studied.

Only recently have studies (see, for example, Du and Mulley 2006; Hewitt and Hewitt 2012; Crespo and Grêt-Regamey 2013; Mulley 2014) started to address the issues of spatial heterogeneity and spatial autocorrelation within HPMs. In all of these studies, the authors report that GWR performed better than the HPM, as indicated by a higher adjusted R², a lower AIC, large differences in parameter estimates and a lower prediction error. Most importantly, they have demonstrated that residential property price premiums vary in sign and magnitude across space, particularly for explanatory variables that are spatial in nature. In other words, the empirical evidence provided by these studies supports the presence of spatial heterogeneity, which cannot be identified within traditional HPMs. Moreover, unlike HPM estimation, the results of these studies also suggest that the error terms produced by GWR exhibit little or no positive spatial autocorrelation.

Geographically Weighted Regression (GWR) Approach

As mentioned above, many previous studies have used the HPM to estimate the effects of rail transit systems on property values. HPMs are mostly expressed as a traditional linear regression model estimated by ordinary least squares (OLS), in which the regression coefficients represent the implicit price of each attribute (Orford 1999). However, one of the main problems in studying property values, and residential property values in particular, is dealing with two spatial effects within the housing market: spatial heterogeneity and spatial autocorrelation. Spatial heterogeneity refers to relationships (measured by parameter estimates in regression modelling) that vary over a geographical area, whilst spatial autocorrelation arises when a variable measured at a certain location is correlated with the same variable measured at nearby locations (LeSage 1998). Although many past HPM studies attempted to control for spatial effects by increasing the sample size, including locational and socioeconomic attributes, measuring the distance from a given residential property to amenities, and applying HPMs to housing submarkets or to different types of properties, the nature of the spatial relationship between residential property prices and attributes was not explicitly modelled. To deal with spatial effects in the housing market, several researchers have proposed and developed a group of techniques known as spatial econometrics, which enable spatiality to be included within property models. These include the spatial expansion method (Casetti 1972), multilevel modelling (Goldstein 1987; Jones and Bullen 1993, 1994), the spatial autoregressive model (also known as the spatial lag model) (Anselin 1988) and, more recently, geographically weighted regression (GWR) (Brunsdon et al. 1996; Fotheringham et al. 1998, 2002).

Motivated by the necessity of addressing spatiality issues, geographically (or locally) weighted regression (GWR) is used in this study. In contrast to the HPM, where a single set of parameter estimates is applied to the entire area, a key advantage of GWR is that it allows parameter estimates to vary across space, which provides a way of accommodating the spatial context within which residential properties are located. The technique can also be regarded as an explanatory tool for developing a better understanding of the relationships being studied, that is, through mapping local parameter estimates. GWR has been identified as a superior methodology: Du and Mulley (2012: 49) argue that ‘although not widely used in transport studies, GWR has been identified as providing more rigorous analysis of change over other spatial analytical tools if its significant data demands can be met’. Since the HPM provides a basis for GWR, adding the longitude and latitude coordinates (ui, vi) to Eq. 1 above allows the general form of a hedonic pricing model to be expressed at location i in space as follows (Crespo and Grêt-Regamey 2013: p. 667):

$$ {\mathrm{P}}_{\mathrm{i}}\left({\mathrm{u}}_{\mathrm{i}},{\mathrm{v}}_{\mathrm{i}}\right)={\upbeta}_0\left({\mathrm{u}}_{\mathrm{i}},{\mathrm{v}}_{\mathrm{i}}\right)+\sum_{\mathrm{k}=1}^{\mathrm{p}}{\upbeta}_{\mathrm{k}}\left({\mathrm{u}}_{\mathrm{i}},{\mathrm{v}}_{\mathrm{i}}\right){x}_{\mathrm{i}\mathrm{k}}+{\upvarepsilon}_{\mathrm{i}},\qquad \mathrm{i}=1,\dots, \mathrm{n}, $$
(2)

where,

Pi: the response variable at point i
ui, vi: the spatial coordinates of point i
β0 (ui, vi): the location-specific intercept term parameter
βk (ui, vi): the kth location-specific parameter
p: the number of unknown local parameters to be estimated (excluding the intercept term)
xik: the kth explanatory variable associated with βk
εi: a random component assumed to be independently and identically distributed
n: the number of observations

Based on Eq. 2 above, the location-specific parameters βk (ui, vi) are estimated using weighted least squares and can be expressed as follows (Crespo and Grêt-Regamey 2013: pp. 667–668):

$$ \overset{\frown }{\upbeta}\left({\mathrm{u}}_{\mathrm{i}},{\mathrm{v}}_{\mathrm{i}}\right)={\left[{\mathrm{X}}^{\mathrm{T}}{\mathrm{W}}_{\mathrm{i}}\mathrm{X}\right]}^{-1}{\mathrm{X}}^{\mathrm{T}}{\mathrm{W}}_{\mathrm{i}}\mathrm{P},\qquad \mathrm{i}=1,\dots, \mathrm{n}, $$
(3)

where,

\( \overset{\frown }{\upbeta}\left({\mathrm{u}}_{\mathrm{i}},{\mathrm{v}}_{\mathrm{i}}\right) \): a (p × 1) vector of parameter estimates at location i
X: an (n × p) matrix of observed explanatory variables
Wi: a distance decay (n × n) matrix
P: an (n × 1) vector of observed response variables

Note that p and i are as defined in Eq. 2 above. Location i is also referred to as the regression point, that is, the point at which parameters are being estimated. As expressed in Eq. 3, observations are weighted through a distance decay matrix (Wi) so that observations located near the regression point are weighted more heavily than observations located further away. Through this geographically weighted calibration, continuous and smooth surfaces of local parameter estimates can be mapped over the geographical area. The advantage of GWR over other spatial methods such as multilevel modelling is that each observation is treated as an individual observation at a specific geographic point. Thus, the maps produced are not confined to artificially bounded geographical areas, such as political or administrative units, as is normally required when modelling spatial data (Du and Mulley 2012; Crespo and Grêt-Regamey 2013). Whilst there are parallels in model development between housing submarket analysis and this research (Leishman et al. 2013), in that both move away from global models and the simple use of dummy variables to account for differences across space, this research does not confine itself to the estimation of geographical boundaries. It is worth mentioning that the results of GWR estimation are sensitive to the choice of bandwidth (the distance captured by the spatial kernels) used to determine the weighting scheme. This paper uses adaptive bisquare spatial kernels, which narrow the bandwidth where data are dense and widen it where data are sparse. The Akaike information criterion (AIC) (Akaike 1973) is used as a measure of fit, following the rule of thumb that the local model is an improvement over the global model if its AIC is more than three units smaller (Fotheringham et al. 2002).
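The local calibration described above can be illustrated with a minimal sketch. It is not the software used in this study: the function names, the synthetic data and the fixed number of neighbours are assumptions made for illustration, and the bandwidth is taken as given rather than selected by AIC. For each regression point, bisquare weights are computed with a bandwidth equal to the distance to the Nth nearest observation, and the weighted least squares estimator of Eq. 3 is solved with an intercept column added to X.

```python
import numpy as np

def bisquare_adaptive_weights(dists, n_neighbours):
    """Bisquare kernel whose bandwidth is the distance to the
    n_neighbours-th nearest observation (adaptive bandwidth)."""
    bandwidth = np.sort(dists)[n_neighbours - 1]
    weights = np.zeros_like(dists, dtype=float)
    inside = dists < bandwidth
    weights[inside] = (1.0 - (dists[inside] / bandwidth) ** 2) ** 2
    return weights

def gwr_local_estimates(coords, X, y, n_neighbours):
    """Solve [X'W_i X]^-1 X'W_i y at every observation i.
    Returns an (n, p + 1) array: local intercept, then local slopes."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])      # add the intercept column
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = bisquare_adaptive_weights(d, n_neighbours)
        XtW = Xd.T * w                          # X'W_i with W_i = diag(w)
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    return betas

# Synthetic illustration (hypothetical data): the slope on x grows
# from west to east, which a single global coefficient would hide.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(200, 2))
x = rng.normal(size=200)
true_slope = 1.0 + 0.2 * coords[:, 0]
y = 3.0 + true_slope * x + rng.normal(scale=0.1, size=200)

local = gwr_local_estimates(coords, x[:, None], y, n_neighbours=40)
print("local slope range:", local[:, 1].min(), local[:, 1].max())
```

Mapping the second column of `local` against the coordinates would give the kind of parameter surface discussed in the text; in practice the number of neighbours would be chosen by minimising AIC rather than fixed in advance.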

The Study Area, Data Acquisition and Selection of Independent Variables

The Study Area

The Kelana Jaya Line LRT system located within Greater Kuala Lumpur, Malaysia, is chosen as the case study. The Greater Kuala Lumpur region comprises Kuala Lumpur and its suburbs, and is the most rapidly growing region and the major financial and commercial centre in Malaysia. It encompasses an area of 2,843 km² and had a population of about 6 million in 2010 (about 21.4 % of the total population of Malaysia). Following implementation in the 1990s, some areas of the Greater Kuala Lumpur region are now served by the LRT system, with the remainder of the area being served by bus. It is worth mentioning that the LRT system, in particular the Kelana Jaya Line, serves the most prominent areas in the Greater Kuala Lumpur region. For example, the Kelana Jaya Line stations are strategically located at major financial and commercial centres and in heavily populated areas of Greater Kuala Lumpur, such as PETRONAS Twin Towers (KLCC), Ampang Park, Petaling Jaya town centre, Wangsa Maju town centre and the central market (see Fig. 1). Thus, it is an appropriate area in which to estimate the effect of the LRT system on residential property values.

Fig. 1 The study area

For private transport, the Greater Kuala Lumpur region benefits from good arterial road access. The level of private vehicle ownership (car and motorcycle) in Greater Kuala Lumpur is the highest in the country. According to the Home Interview Survey carried out by the Japan International Cooperation Agency (JICA) in 1998, vehicle ownership was approximately 211 cars per 1,000 population and 164 motorcycles per 1,000 population. As the number of private cars and motorcycles in Greater Kuala Lumpur increases, the demand for commuting to and from the city centre tends to increase far beyond the capacity of the road network, even after existing roads have been upgraded and new roads constructed. As a result, traffic congestion has become a serious problem for Greater Kuala Lumpur, particularly in Kuala Lumpur City Centre.

This study required residential property transaction data, structural data and more importantly good quality locational data. Many data sources were explored before making the decision on data sources for the study.

Data Acquisition and Selection of Independent Variables

As with previous HPM studies, this study uses secondary data sources. These data can be grouped into two categories: residential property data and locational attributes data.

Residential Property Data

Residential property price transaction data for 2005 were chosen as the sample for this study. This marks a period several years after rail transit systems began operating in the Greater Kuala Lumpur region. In total, 2,338 house selling prices were collected. However, after several steps to clean the sample dataset by eliminating unsuitable data and updating unavailable data (see Footnote 4), the study was left with 1,580 observations across the two housing areas (Petaling Jaya and Wangsa Maju) shown in Fig. 1. These cross-sectional data refer to residential properties located within 2 km (straight-line distance) of LRT stations. Planners typically assume that people will comfortably walk approximately 800 m to reach transit stations (Unterman 1984). However, in this study, we expand the pedestrian access distance to a 2 km radius around stations in order to capture variation in property values not necessarily observed within the 800 m radius. The selling price of each residential property and its structural attributes were collected from the Department of Valuation and Property Services, Malaysia (Kuala Lumpur branch). The structural attributes obtained from the data provider and used for the analysis are size of building (measured by floor area and by the number of bedrooms and bathrooms), size of lot area, and property type.

Locational Attributes Data

The data on the base map, land parcels and land use were obtained from the Department of Survey and Mapping Malaysia and the Department of Agriculture Malaysia. The data are believed to be of high quality and reliability as they come from the professional bodies that provide maps and land use data in Malaysia. In order to measure the distance to locational attributes from a given residential property, a geographical information system (GIS) was used to organise and manage the large spatial datasets (that is, units of residential properties) and to estimate the structural and locational attributes. Most importantly, GIS was used to position each observation accurately on a local map by using its land parcel number. Moreover, the combination of GIS and spatial analysis has been particularly useful because the proximity of observations to various locational attributes was measured accurately using network distance. The distance in metres was measured along the street network by using a GIS program named Multiple Origins to Multiple Destinations, obtained from the Environmental Systems Research Institute (ESRI) support centre (see Footnote 5). The network-distance measurement using this program requires three layers of spatial data: points of origin (observations), points of destination (locational attributes) and the road network data. This allowed the shortest route from each observation to the locational attributes to be calculated. Furthermore, the Multiple Origins to Multiple Destinations program allows more than one destination to be selected at any one time. Thus, proximity to locational attributes can be calculated simultaneously for each observation.
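The idea of measuring shortest network distances from many origins to many destinations can be sketched as follows. This is not the ESRI program referred to in the text; it is a plain Dijkstra search over a toy road graph, and the node names and link lengths are hypothetical.

```python
import heapq

def shortest_distances(graph, origin):
    """Dijkstra: shortest network distance (m) from origin to each reachable node."""
    dist = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, length in graph.get(node, {}).items():
            nd = d + length
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(queue, (nd, neighbour))
    return dist

def origin_destination_table(graph, origins, destinations):
    """Network distance from every origin to every destination."""
    table = {}
    for o in origins:
        dist = shortest_distances(graph, o)
        table[o] = {d: dist.get(d, float("inf")) for d in destinations}
    return table

# Toy undirected road network (hypothetical link lengths in metres).
links = [
    ("house_a", "junction", 300.0),
    ("house_b", "junction", 450.0),
    ("junction", "lrt_station", 500.0),
    ("junction", "school", 200.0),
    ("school", "lrt_station", 250.0),
]
graph = {}
for u, v, length in links:
    graph.setdefault(u, {})[v] = length
    graph.setdefault(v, {})[u] = length

od = origin_destination_table(graph, ["house_a", "house_b"],
                              ["lrt_station", "school"])
print(od["house_a"]["lrt_station"])  # 750.0, via the school rather than the direct link
```

The three inputs mirror the three spatial layers named above: origins (observations), destinations (locational attributes) and the road network.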

The travel time savings variable was calculated using timetables for LRT services and network timings for cars, as the competing mode. Several components were considered in measuring travel time to the CBD by the LRT system: access time from a house to an LRT station, waiting time (at peak LRT travel times) and in-vehicle travel time. Waiting and in-vehicle travel times from each station were obtained from the resource centre of Rapid KL. Access times from a house to an LRT station were calculated using the shortest road distance to an LRT station at a walking speed of 80 m/min (O’Sullivan and Morrall 1996). By adding access time, waiting time and in-vehicle travel time, the total travel time to the CBD by the LRT system was estimated. Travel times by car to the CBD were calculated by taking into account the regular roads to the CBD for each observation and speed limits. Travel times to and from the CBD in the morning and evening were chosen since these are the two peak periods of the day, when people travel to and from work. The choice of regular roads to the CBD is based on data obtained from an extensive study conducted by the Geography Department at the University of Malaya. Regarding speed limits, it is important to consider the delays caused by serious traffic congestion to and from Kuala Lumpur city centre during peak periods. The speed limits for each road in this study were obtained from studies conducted by Mohamad and Kiggundu (2007) and the Ministry of Transport, Malaysia. Finally, the difference between travel time by car and travel time by LRT to the CBD yields the travel time savings, which were then used as one of the key variables in the principal analysis.
It is vital to note that the analysis that has been carried out in this study has clearly indicated that the existence of the LRT system in Greater Kuala Lumpur has improved accessibility to and from Kuala Lumpur city centre.
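The travel time calculation described above can be written out as a short worked example. The 80 m/min walking speed comes from the text; the distances and times are hypothetical, and taking savings as car time minus LRT time is an assumption about the direction of the subtraction.

```python
WALKING_SPEED_M_PER_MIN = 80.0  # walking speed stated in the text

def lrt_travel_time(access_distance_m, waiting_min, in_vehicle_min):
    """Total LRT travel time to the CBD: access + waiting + in-vehicle (minutes)."""
    access_min = access_distance_m / WALKING_SPEED_M_PER_MIN
    return access_min + waiting_min + in_vehicle_min

def travel_time_savings(car_min, lrt_min):
    """Assumed definition: minutes saved by using LRT instead of the car."""
    return car_min - lrt_min

# Hypothetical property 640 m by road from its nearest station.
lrt_min = lrt_travel_time(access_distance_m=640, waiting_min=5, in_vehicle_min=20)
savings = travel_time_savings(car_min=45, lrt_min=lrt_min)
print(lrt_min, savings)  # 33.0 12.0
```

A positive value means the LRT trip is quicker than the car trip during the peak period.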

A list of explanatory variables considered for inclusion in the HPM and GWR, together with their descriptive statistics, is given in Table 2. These explanatory variables were chosen according to the theory of transport improvement benefits and the results of previous studies (see, for example, Landis et al. 1995; Henneberry 1998; Cervero and Duncan 2002; Hess and Almeida 2007) and, most importantly, on the basis of their availability. However, it is important to note that in regression-based analyses some explanatory variables are often multicollinear. Therefore, estimating accurate and stable regression coefficients may be difficult (Tyrväinen and Miettinen 2000). To handle this problem, Tyrväinen and Miettinen suggest that a highly collinear variable can be omitted from the model, provided this does not lead to serious specification bias. Multicollinearity between the explanatory variables considered for the final models was detected using Pearson’s correlation coefficient and variance inflation factors (VIF). According to Orford (1999), a Pearson’s correlation coefficient above 0.8 and a VIF above ten indicate harmful collinearity. Conversely, correlation coefficients below 0.8 and VIF values below ten are desired, since they indicate that the model does not face serious multicollinearity. This rule was applied within this study.
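A minimal sketch of this screening rule is given below. The thresholds (0.8 and ten) are those quoted from Orford (1999); the function names, the variable names and the synthetic data are hypothetical. The VIF for a variable is computed as 1 / (1 − R²), where R² comes from regressing that variable on all the others.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j
    on the remaining columns plus an intercept."""
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r_squared = 1.0 - resid.var() / target.var()
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs

def flag_collinear(X, names, corr_limit=0.8, vif_limit=10.0):
    """Return variable pairs with |r| above corr_limit and variables with VIF above vif_limit."""
    corr = np.corrcoef(X, rowvar=False)
    k = len(names)
    pairs = [(names[a], names[b])
             for a in range(k) for b in range(a + 1, k)
             if abs(corr[a, b]) > corr_limit]
    vifs = variance_inflation_factors(X)
    high_vif = [names[j] for j in range(k) if vifs[j] > vif_limit]
    return pairs, high_vif

# Synthetic example: lot_area is almost a copy of floor_area.
rng = np.random.default_rng(0)
floor_area = rng.normal(size=500)
lot_area = floor_area + 0.1 * rng.normal(size=500)
dist_school = rng.normal(size=500)
X = np.column_stack([floor_area, lot_area, dist_school])

pairs, high_vif = flag_collinear(X, ["floor_area", "lot_area", "dist_school"])
print(pairs, high_vif)
```

In this synthetic case one of the two flagged variables would be a candidate for omission, subject to the specification-bias caveat noted above.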

Table 2 A list of explanatory variables and descriptive statistics of the model’s variables

Empirical Results

The results of the hedonic price and GWR models using the above specification are presented below in two stages. The first part shows the results from the HPM and the second part shows the results from the GWR model.

The HPM Estimation

The first stage of the estimation process using the HPM is to choose the functional form which best portrays the relationship between a property’s market price and each of the variables describing its characteristics. In other words, the functional form is the exact nature of the relationship between the dependent variable (a vector of residential property prices) and the explanatory variables (such as structural and locational attributes). Four functional forms are commonly used in HPMs: linear, semi-log, double-log and Box-Cox linear (Garrod and Willis 1992; Cropper et al. 1988; Palmquist 1984). Unfortunately, economic theory does not generally give clear guidelines on how to choose a particular functional form for property attributes (Tu 2000; Garrod and Willis 1992). However, Cropper et al. (1988) suggest that linear, semi-log, double-log and Box-Cox linear forms perform best, with quadratic forms, including the quadratic Box-Cox, faring relatively badly. Based on the advice given by Cropper et al. (1988), a double-log specification was used to estimate the effects of the LRT system on residential property values in this study. The model is regressed on a set of determinants as follows:

$$ \begin{array}{l} \ln {\mathrm{P}}_{\mathrm{i}}={\upbeta}_0+{\upbeta}_1 \ln {\mathrm{NETDIST}}_{\mathrm{i}}+{\upbeta}_2 \ln {\mathrm{TIMESAVINGS}}_{\mathrm{i}}+{\upbeta}_3 \ln {\mathrm{FLRAREA}}_{\mathrm{i}}+{\upbeta}_4 \ln {\mathrm{BEDS}}_{\mathrm{i}} \\ \quad +{\upbeta}_5{\mathrm{TYPTRRD}}_{\mathrm{i}}+{\upbeta}_6{\mathrm{TYPSEMID}}_{\mathrm{i}}+{\upbeta}_7{\mathrm{TYPDETCH}}_{\mathrm{i}}+{\upbeta}_8{\mathrm{TYPCONDO}}_{\mathrm{i}} \\ \quad +{\upbeta}_9{\mathrm{TYPAPT}}_{\mathrm{i}}+{\upbeta}_{10} \ln {\mathrm{CBD}}_{\mathrm{i}}+{\upbeta}_{11} \ln {\mathrm{PRIMARYSCH}}_{\mathrm{i}}+{\upbeta}_{12} \ln {\mathrm{SECONDARYSCH}}_{\mathrm{i}} \\ \quad +{\upbeta}_{13} \ln {\mathrm{COMMERCIAL}}_{\mathrm{i}}+{\upbeta}_{14} \ln {\mathrm{HOSPITAL}}_{\mathrm{i}}+{\upbeta}_{15} \ln {\mathrm{LAKE}}_{\mathrm{i}}+{\upbeta}_{16} \ln {\mathrm{INDUSTRY}}_{\mathrm{i}} \\ \quad +{\upbeta}_{17} \ln {\mathrm{FOREST}}_{\mathrm{i}}+{\upbeta}_{18} \ln {\mathrm{INSTITUTIONAL}}_{\mathrm{i}}+{\upvarepsilon}_{\mathrm{i}} \end{array} $$
(4)

where i is the subscript denoting each property; P_i is the price of property i in Malaysian Ringgit (MYR); ln is the natural logarithm; NETDIST is the network distance from the property to an LRT station, measured in metres; TIMESAVINGS denotes the travel time savings to the CBD when people travel with the LRT system; FLRAREA is the floor area of the property in square feet; BEDS is the number of bedrooms of the property; and TYPxxx is a set of dummy variables indicating the type of house, defined as follows:

  • TYPTRRD is 1 if the property is terraced, 0 otherwise;

  • TYPSEMID is 1 if the property is semi-detached, 0 otherwise;

  • TYPDETACH is 1 if the property is detached, 0 otherwise;

  • TYPCONDO is 1 if the property is a condominium, 0 otherwise;

  • TYPAPT is 1 if the property is an apartment, 0 otherwise.

CBD, PRIMARYSCH, SECONDARYSCH, COMMERCIAL, HOSPITAL, LAKE, INDUSTRY, FOREST and INSTITUTIONAL are the network distances from the property to Kuala Lumpur city centre, primary schools, secondary schools, commercial areas, hospitals, lakes, industrial areas, forests and institutional areas respectively. These variables are all measured in metres. Finally, β_0, …, β_18 denote the set of parameters to be estimated (including the intercept term), and ε_i denotes the error term, which is assumed to be independently and identically distributed.
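The double-log specification above can be estimated by ordinary least squares once prices and continuous attributes are logged. The following is a minimal sketch on synthetic data, not the study's data or software: the function name `fit_double_log`, the three-variable design and the "true" coefficients are assumptions made for illustration only.

```python
import numpy as np

def fit_double_log(price, continuous, dummies):
    """OLS fit of ln(price) on ln(continuous attributes) and 0/1 dummies.

    Returns (beta, r_squared); beta[0] is the intercept.
    """
    y = np.log(price)
    X = np.column_stack([np.ones(len(y)), np.log(continuous), dummies])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centred = y - y.mean()
    r2 = 1.0 - (resid @ resid) / (centred @ centred)
    return beta, r2

# Synthetic illustration only (hypothetical values, not the study's sample).
rng = np.random.default_rng(0)
n = 500
netdist = rng.uniform(100, 3000, n)   # metres to the nearest station
flrarea = rng.uniform(600, 3000, n)   # floor area in square feet
terraced = rng.integers(0, 2, n)      # 1 if terraced, 0 otherwise
true_beta = np.array([8.0, -0.10, 0.80, 0.05])
ln_price = (true_beta[0]
            + true_beta[1] * np.log(netdist)
            + true_beta[2] * np.log(flrarea)
            + true_beta[3] * terraced
            + rng.normal(0, 0.05, n))

beta_hat, r2 = fit_double_log(np.exp(ln_price),
                              np.column_stack([netdist, flrarea]),
                              terraced)
print(beta_hat, r2)
```

In a double-log model the coefficients on logged attributes are elasticities, so in this sketch the second element of `beta_hat` is the percentage change in price associated with a one per cent change in network distance.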

Table 3 summarises the parameter estimates for the ‘best’ double-log model, together with the results of a Monte Carlo significance test for the GWR model. In general, the model fits the data reasonably well and explains 81 % of the variation in the dependent variable. In the final model, all of the explanatory variables that influence residential property values are significant at the 1 % level and have the anticipated signs.

Table 3 Results of the global model and the Monte Carlo test for spatial significance (n = 1580)

Focus and free variables were incorporated in the final model on the basis of significant coefficient values and to alleviate potential issues of multicollinearity. The implicit prices of the continuous explanatory variables were calculated by holding all other variables at their mean level. Thus, every metre away from the LRT station was shown to decrease the expected selling price of a residential property by MYR27.055 (USD7.119; see footnote 6). In the case of the variable TIMESAVINGS, every 1 min saved on the journey to the city centre adds a premium of approximately MYR2,474.920 (USD651.295) to residential property value.

Among the structural attributes of properties, the most significant contribution is made by the size of the property, measured by the floor area (FLRAREA). For every square-foot increase in floor area, the expected selling price of a residential property increases by MYR430.143 (USD113.196), evaluated at the mean property price. The greater magnitude of the effect of floor area was expected, since floor area is usually associated with the size of the property; this is consistent with most HPM literature.

Among locational attributes, the distance to the CBD is the most statistically significant. The model suggests that for every metre away from the CBD, residential property values are likely to decrease by about MYR17.063 (USD4.490), providing strong evidence of a price gradient from the CBD, as in the monocentric model. The distance to a recreational lake (LAKE) is the least statistically significant locational attribute. For every metre away from a recreational lake, there is a small decrease in residential property value of approximately MYR0.722 (USD0.19).
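The implicit prices reported above can be related to the estimated coefficients as follows. This is a sketch which assumes, as the text states, that each implicit price is evaluated with all variables held at their sample means; the bar notation for sample means is introduced here for illustration. In the double-log model the coefficient on a logged continuous attribute X_k is an elasticity, so differentiating equation (4) gives

$$ \frac{\partial \ln P}{\partial \ln X_k} = \beta_k \quad \Longrightarrow \quad \frac{\partial P}{\partial X_k} = \beta_k \frac{P}{X_k}, \qquad \left. \frac{\partial P}{\partial X_k} \right|_{\bar{P},\, \bar{X}_k} = \beta_k \frac{\bar{P}}{\bar{X}_k} $$

Under this reading, the MYR-per-metre figure for NETDIST would correspond to β_1 multiplied by the ratio of mean price to mean network distance. The dummy variables enter equation (4) unlogged, so their effects would not be converted in this way.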

Calibration of the HPM: GWR Estimation

As highlighted in the literature, the main contribution of the GWR technique is the ability to explore the spatial variation of explanatory variables in the model, where the coefficients of explanatory variables may vary significantly over geographical space. The GWR software reports two sets of diagnostic information, one for the HPM and one for the GWR model, including general information on each model and an ANOVA, which can be used to test the null hypothesis that the GWR model offers no improvement over the HPM.
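The idea of locally varying coefficients can be sketched as one weighted least-squares fit per observation, with weights that decay with distance from that observation. The code below is an illustrative sketch only and is not the GWR software used in the study: the Gaussian kernel, the fixed bandwidth, the function name `gwr_local_fit` and the synthetic data are all assumptions. The local t-ratios use the commonly cited GWR variance estimate based on the hat matrix S.

```python
import numpy as np

def gwr_local_fit(coords, X, y, bandwidth):
    """Fit one distance-weighted regression per observation.

    coords: (n, 2) point coordinates; X: (n, k) design matrix including a
    column of ones; y: (n,) response; bandwidth: Gaussian kernel width in
    the units of coords (an assumed kernel, chosen for illustration).
    Returns (betas, t_ratios), each of shape (n, k).
    """
    n, k = X.shape
    betas = np.empty((n, k))
    cct = np.empty((n, k))   # diagonal of C_i C_i^T for each location
    S = np.empty((n, n))     # hat matrix: row i comes from local fit i
    for i in range(n):
        d2 = np.sum((coords - coords[i]) ** 2, axis=1)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)
        XtW = X.T * w                        # X' W_i, shape (k, n)
        C = np.linalg.solve(XtW @ X, XtW)    # (X' W_i X)^-1 X' W_i
        betas[i] = C @ y
        cct[i] = np.sum(C ** 2, axis=1)
        S[i] = X[i] @ C
    resid = y - S @ y
    dof = n - 2.0 * np.trace(S) + np.sum(S ** 2)   # n - 2tr(S) + tr(S'S)
    sigma2 = resid @ resid / dof
    t_ratios = betas / np.sqrt(sigma2 * cct)
    return betas, t_ratios

# Synthetic illustration: the slope drifts from -1 to +1 across the area.
rng = np.random.default_rng(1)
n = 200
coords = rng.uniform(0, 10, (n, 2))
x = rng.normal(size=n)
slope_true = -1.0 + 0.2 * coords[:, 0]
y = 2.0 + slope_true * x + rng.normal(0, 0.1, n)
X = np.column_stack([np.ones(n), x])

betas, t_ratios = gwr_local_fit(coords, X, y, bandwidth=1.5)
print(np.corrcoef(betas[:, 1], slope_true)[0, 1])
```

A single global regression on these synthetic data would report a slope near zero, whereas the local fits recover the drift from negative to positive; mapping `betas` and `t_ratios` side by side is the kind of display discussed for NETDIST later in the paper.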

In this analysis, the adjusted coefficient of determination (adjusted R2) rises from 81 % in the HPM to 88 % in the GWR model, and the Akaike Information Criterion (AIC) of the GWR model (−418.03) is lower than that of the HPM (205.13). Together these suggest that the local GWR model gives a significantly better explanation, after taking the degrees of freedom and complexity into account.
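The two fit measures compared above can be sketched as follows. This is a hedged illustration using the textbook adjusted R2 and the Gaussian-likelihood AIC; the GWR software may report a small-sample corrected variant, so the formulas, the function names and the toy numbers here are assumptions rather than the study's exact calculation. For a GWR model, `n_params` would be an effective number of parameters rather than a simple count.

```python
import numpy as np

def adjusted_r2(y, fitted, n_params):
    """Adjusted R-squared; n_params counts the intercept."""
    n = len(y)
    rss = np.sum((y - fitted) ** 2)
    tss = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - (rss / (n - n_params)) / (tss / (n - 1))

def gaussian_aic(y, fitted, n_params):
    """AIC under Gaussian errors: -2 log-likelihood + 2 (n_params + 1).

    The extra 1 counts the error variance as an estimated parameter.
    """
    n = len(y)
    rss = np.sum((y - fitted) ** 2)
    return n * np.log(2 * np.pi * rss / n) + n + 2 * (n_params + 1)

# Toy comparison of a close fit and a loose fit (hypothetical numbers).
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
pattern = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
close_fit = y + 0.1 * pattern
loose_fit = y + 1.0 * pattern

for name, fitted in [("close", close_fit), ("loose", loose_fit)]:
    print(name, adjusted_r2(y, fitted, 2), gaussian_aic(y, fitted, 2))
```

The closer fit has the higher adjusted R2 and the lower AIC. Its AIC is also negative, which shows that a negative value such as the one reported for the GWR model is not unusual when the residual variance is small.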

As mentioned above, one of the advantages of GWR is the ability to explore the spatial variation of explanatory variables in the model. Based on a Monte Carlo significance test procedure, the GWR software can examine the significance of the spatial variability of parameters identified in the local parameter estimates. The results of these tests, shown in Table 3 above, demonstrate that there is highly significant (at the 5 % level) variation in the local parameter estimates for all explanatory variables.
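A Monte Carlo test of spatial variability can be sketched as a permutation procedure: the observations are randomly reassigned to locations, the local model is refitted, and the spread of the local estimates from the real arrangement is compared with the spread obtained under random arrangements. The code below is an illustrative sketch under assumed choices (Gaussian kernel, fixed bandwidth, standard deviation as the measure of spread, 99 permutations, synthetic data) and is not the procedure as implemented in the GWR software.

```python
import numpy as np

def local_betas(coords, X, y, bandwidth):
    """Local weighted least-squares coefficients, one row per location."""
    betas = np.empty(X.shape)
    for i in range(len(y)):
        d2 = np.sum((coords - coords[i]) ** 2, axis=1)
        XtW = X.T * np.exp(-0.5 * d2 / bandwidth ** 2)
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

def monte_carlo_p_values(coords, X, y, bandwidth, n_perm=99, seed=0):
    """Pseudo p-value per coefficient for spatial variability.

    Compares the standard deviation of the local estimates with the
    values obtained after randomly shuffling the locations.
    """
    rng = np.random.default_rng(seed)
    observed = local_betas(coords, X, y, bandwidth).std(axis=0)
    at_least = np.ones(X.shape[1])   # the observed arrangement counts once
    for _ in range(n_perm):
        shuffled = coords[rng.permutation(len(y))]
        simulated = local_betas(shuffled, X, y, bandwidth).std(axis=0)
        at_least += simulated >= observed
    return at_least / (n_perm + 1)

# Synthetic illustration: only the slope varies over space.
rng = np.random.default_rng(2)
n = 150
coords = rng.uniform(0, 10, (n, 2))
x = rng.normal(size=n)
y = 2.0 + (-1.0 + 0.2 * coords[:, 0]) * x + rng.normal(0, 0.1, n)
X = np.column_stack([np.ones(n), x])

p_values = monte_carlo_p_values(coords, X, y, bandwidth=1.5)
print(p_values)
```

A small pseudo p-value for a coefficient indicates that its local estimates vary more across space than would be expected if location were irrelevant, which is the sense in which the variation in Table 3 is described as significant.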

Analysing the Spatial Variation of Parameter Estimates and T-Surfaces

All the local parameter estimates can be mapped but, due to space limitations, this paper concentrates on the NETDIST variable only, since distance to the LRT station is the focus of this study. The best interpretation comes from maps of local parameter estimates alongside maps of the local t-ratio, since the local t-ratio maps exhibit the local significance that accounts for the locally varying estimate errors (Crespo and Grêt-Regamey 2013; Du and Mulley 2006; Mennis 2006). To assist readers with the place names mentioned in the text, the housing estates included in the sample of this study are labelled on Fig. 2a and b. The location of these housing estates is shown in Fig. 1, where the upper circled area is Wangsa Maju and the lower is Petaling Jaya.

Fig. 2 (a) Housing estates in Petaling Jaya. (b) Housing estates in Wangsa Maju

Figure 3a and b show the spatial variation over the geographical area of the premium on residential property values provided by the LRT system. In these two figures, the local estimated parameters are shown as differently coloured points. It is clear from the maps that the estimated parameters exhibit considerable local variation over geographical space.

Fig. 3 (a) Map of the local parameter estimates associated with variable NETDIST in Petaling Jaya. (b) Map of the local parameter estimates associated with variable NETDIST in Wangsa Maju

Proximity to the Nearest LRT Station (NETDIST)

To give a general overview of the relationship between the existence of the LRT system and residential property values in the Greater Kuala Lumpur region, this subsection examines the spatial variation of value premiums over the geographical area associated with proximity to the nearest LRT station (NETDIST). The spatial variation over the geographical area of the NETDIST parameter estimates and the associated t-ratio are depicted in Fig. 3a and b.

The results in Fig. 3a and b suggest that a positive relationship between the existence of the LRT system and residential property values is found in the majority of the housing estates in Petaling Jaya, whilst most of the housing estates in the Wangsa Maju area exhibit unexpected results in which the existence of the LRT system has no impact on residential property values. To aid interpretation, the locations and names of the stations have been included in Fig. 3a and b.

The findings within Petaling Jaya (Figs. 2a and 3a) can be explained through serious traffic congestion and the inefficiency of public transportation such as the bus services operating in the area. For instance, the waiting time for buses is between 30 and 60 min. The serious traffic congestion from Petaling Jaya to Kuala Lumpur and vice versa (particularly during peak hours) together with the inefficiency of public transportation has led to long travel times for residents.

The introduction of the LRT system in the late 1990s has brought great relief for many Petaling Jaya residents, particularly for those who have had to rely on public transportation (low and middle income residents). The services provided by the LRT system in the area have truly improved the accessibility to and from the city centre. Therefore, it is reasonable to expect that buyers of residential properties in Petaling Jaya were willing to pay a higher price to be located closer to an LRT station. These findings are in line with previous studies carried out in cities where greater access to employment and other amenities provided by the rail transit systems tend to show positive property value effects (see, for example, Nelson 1992; Voith 1991; Chen et al. 1997; Lewis-Workman and Brod 1997; Knapp et al. 2001; Cervero and Duncan 2002; Du and Mulley 2006; Hewitt and Hewitt 2012). In the case of the Wangsa Maju area, public transportation serving that area was of a good quality before the LRT system was introduced in the late 1990s. Moreover, Wangsa Maju itself is located very close to the city centre (approximately 10 min drive by car), and therefore the role of the LRT system as a mode of transport to the CBD is less important.

These findings will now be explored in more detail in terms of the spatial variation in the premiums. As can be seen from Fig. 3a and b, the capitalisation in expected selling price varies significantly over the housing estates covered. The premium in some areas is found to be greater than that estimated by the HPM, particularly for residential properties located in Sections 7, 8, 21, 22, 51A and some houses in Section 14 in the Petaling Jaya area (see Fig. 3a) and in Taman Setiawangsa in the Wangsa Maju area. For example, every metre away from the nearest LRT station in the above-mentioned sections of Petaling Jaya reduced the expected selling price of a residential property by between MYR27.67 (USD7.28) and MYR35.60 (USD9.369) (orange and red dots), with a significant negative t-ratio. In Taman Setiawangsa in the Wangsa Maju area, the reduction covered an even wider range, between MYR27.00 (USD7.105) and MYR55.678 (USD14.652) (red dots), with a significant negative t-ratio.

The results of the GWR calibration also reveal that the effect of each metre of distance from the nearest LRT station on the expected selling price varies among residential properties located in Sections 7, 8, 21, 22, 51A and some houses in Section 14 in the Petaling Jaya area. For example, for every metre away from the nearest LRT station in Sections 22, 51A and some houses in Sections 14 and 8 (southern part), the expected selling price of a residential property was reduced by between MYR31.18 (USD8.21) and MYR35.60 (USD9.37) (red dots), with a significant negative t-ratio. This spatial variation can be attributed to the high-density economic activities around the stations that serve residents of the Sections 22, 51A and parts of Sections 14 and 8 housing estates (Taman Jaya, Asia Jaya and Taman Paramount LRT stations), compared with stations in purely residential areas, such as those serving the Sections 20, 21 and part of Section 14 (northern part) housing estates (Taman Paramount and Taman Bahagia LRT stations). This finding is in line with previous studies carried out in cities where vibrant LRT stations tend to show positive property value effects (see, for example, Debrezion et al. 2006; Hess and Almeida 2007).

The results of the GWR calibration also show an unexpected outcome in which the existence of the LRT system has no significant impact on the expected selling price of a residential property. This can be observed in Sections 1, 3, 10, 11 and 12 in the Petaling Jaya area and in Desa Setapak, Taman Seri Rampai, Taman Bunga Raya, Wangsa Melawati, Taman Setapak, Taman Ibu Kota and Taman Setapak Inn in the Wangsa Maju area. As can be observed from Fig. 3a, the expected selling price of a residential property in Sections 1, 3, 10, 11 and 12 in Petaling Jaya increased by between MYR1.983 (USD0.522) and MYR20.59 (USD5.148) for every metre away from the nearest LRT station (light green and dark green dots), with an insignificant t-ratio. These areas are occupied by high-income residents who prefer to use their own means of transportation rather than public services, which helps to explain this observation. The finding in the Petaling Jaya area indicates that a positive relationship between the existence of the LRT system and residential property values is found only in low- and middle-income neighbourhoods. This accords with the findings of Nelson (1992) in Atlanta, who claims that property values increased in low-income neighbourhoods but not in high-income neighbourhoods. The reason is that low-income residents tend to rely on public transit and thus attach higher value to living close to the station.

Similar results were also observed for residential properties located in Desa Setapak, Taman Seri Rampai, Taman Bunga Raya, Wangsa Melawati, Taman Setapak, Taman Setapak Inn and Taman Ibu Kota in the Wangsa Maju area. For example, for every metre away from the nearest LRT station, the price increased by between MYR1.611 (USD0.424) and MYR72.799 (USD19.158) (yellow, light green and dark green dots), with a significant positive t-ratio. It is important to note that the inverse relationship found for these housing estates is due to several factors, which are unique to each area. Residential property values in the Desa Setapak housing estate increase with every metre away from the LRT station because of traffic congestion around the station (Wangsa Maju LRT station). The congestion is exacerbated by the lack of adequate parking space, which forces LRT commuters to park their cars around the homes of local residents. Furthermore, this residential area is located too close to the LRT station. As a result, the area has become undesirable, which ultimately leads to a decrease in residential property values. This is in line with the findings of Chen et al. (1997), who found negative premiums on property values in immediate station areas and attributed them to nuisance effects, including noise, safety, aesthetics and traffic. Another housing estate that experienced a similar problem is Taman Bunga Raya.

An increase in residential property values for every metre away from the nearest LRT station is also observed in the Taman Setapak, Taman Setapak Inn and Taman Ibu Kota housing estates. There are several reasons for this: these areas are located just 5 km from the CBD and are directly connected to it by a good main road. They were also served by efficient bus services to the CBD years before the LRT system was introduced. As such, the LRT station adds little to these areas, with car and bus services being much more convenient.

Conclusion and Policy Implications

Previous hedonic pricing studies have provided estimates of value premiums of proximity to public transport services. Yet an international literature review has demonstrated much inconsistency between studies in terms of the magnitude of the premiums and, in some cases, even the sign of the effect on house prices. Such inconsistencies do little to help policy makers faced with a need to estimate likely value premiums ex ante. Rather than comparing global estimates between studies, it is argued that more focus needs to be given to the reasons for such variation. Such research requires a greater understanding of the workings of local factors. The increasing availability of spatial econometric approaches, which can explore the variation in values across the areas affected, opens up new opportunities for understanding. This spatial variation was completely unseen in the previous global versions of the hedonic price method. This paper has considered the variation in premiums within an LRT system and the reasons for that variation.

Using a geographically weighted regression approach to estimate premiums across an LRT system within Kuala Lumpur in Malaysia, after controlling for other factors, this study has demonstrated wide variations in premiums across the study area. Consistent with previous cross-study comparisons, premiums were found to vary within-study from negative to positive depending on factors such as the desirability of the area, income characteristics of the neighbourhoods, quality of the pre-existing transport systems, negative effects of poor parking facilities at stations and proximity to the CBD. These findings further enhance our understanding of local factors.

Such within-study variation suggests that ex ante estimation of value premiums is challenging: knowledge of previously identified factors and how they interact to affect premiums needs to be combined with an understanding of local circumstances within some form of advanced benefit transfer exercise. Given the cost of transport systems and the potential for them to be poorly implemented, such evidence-informed decision making would appear appropriate and crucial in the design of LRT systems.

Given the challenges governments face in funding public transport, it is important to reflect on the implications of this research for the potential to implement Land Value Capture (LVC). Within the introduction to this paper three potential mechanisms were briefly described which could be used to fund or partly fund public transport investment. The first two mechanisms relate to recouping the costs of public transport through either an expected future increase in tax revenue or internalising the benefits of infrastructure improvement through land sales following completion of the development (Medda 2012). In both cases there is the expectation that there will be a value premium from proximity to LRT. The results of this study have illustrated that such a premium is not inevitable and that careful consideration is required of the routing of the LRT and the design and location of stations. The results of this study have added to the understanding of the factors upon which “success” is likely to depend. This remains an important area for future research. The third mechanism considered here for implementing LVC is a betterment tax; a ‘tax on the land value added by public investment’ (Medda 2012, p156). Consistent with previous findings of Nelson (1992), the results in this study suggest value premiums are likely to be most significant in low-income areas. In this context, a betterment tax based on expected benefits would be controversial. Effectively, those least able to pay would need to contribute the most to the costs of public transport.

These findings suggest that whilst there is potential to fund public transport improvement partially through LVC measures, implementation raises both ethical issues in terms of who should pay and concerns relating to significant risks borne by public agencies, where success and/or the degree of success is difficult to determine ex ante. Whilst ex ante predictions from generic hedonic pricing models of other local transport schemes can be useful, such as those used by van der Krabben and Needham (2008), the research presented in this paper suggests models using GWR can be much more informative. Unlike other approaches to spatial modelling, observations using GWR are treated individually at a specific geographic point. This research has demonstrated how this enhances the potential to produce detailed policy information, such that the nuances of “success” in public transport implementation can be given due consideration.