Forms and Functions of the Real Estate Market of Palermo (Italy). Science and Knowledge in the Cluster Analysis Approach

Napoli, Grazia; Giuffrida, Salvatore; Valenti, Alberto

doi:10.1007/978-3-319-49676-4_14

Part of the book series: Green Energy and Technology (GREEN)

Abstract

The analysis of the housing market of a city requires suitable approaches and tools, such as data mining models, to represent its complexity, which derives from many elements: the type of capital asset (a house is both a common good and an investment good), the heterogeneity of the urban areas (each has its own historical and representative values and different urban functions), and the variability of building quality. The housing market of the most densely populated area of Palermo (Italy), corresponding to ten districts, is analyzed to verify the degree of its inner homogeneity and the relations between the quality of the characteristics and the price of the properties. Five hundred sets of housing data have been collected and processed by cluster analysis with the aim of describing the structure of the housing market in each district and developing operational tools for the implementation of urban policies and public-private investments.

1 Introduction

This paper is part of a larger research project on the urban real-estate market, concerned with the analysis of its general structure and the monitoring of its evolution during the last few years, when the economic and property crisis, along with the change in the fiscal system, deeply affected the readiness to invest, the competition between different types of assets, and the very idea of urban real-estate capital (Breuer and Nadler 2012; Gabrielli et al. 2015; Giuffrida et al. 2015). The analysis applied to the housing market in Palermo (Italy) can provide an information base for public investment decision processes, the implementation of planning policy, and public/private negotiations, especially in a period in which the map of urban values is going to be reconfigured because of significant modifications of the public transport system (streetcars and subway).

Assuming the district as the minimum spatial unit, five hundred sets of housing data were collected in ten urban districts corresponding to a wide, densely populated, and complex area of the city. To manage and organize the collected data, the study applies the cluster analysis approach to provide different hypotheses for the articulation of the real estate market into submarkets expressing the characteristics of the properties. In this way, the study intends to express the fluid and mutable relationship between the objects (the properties, each with its own characteristics) and the hypothetical model that is meant to represent them. The concept of homogeneity is mostly related to the topographical demarcation of the district and to the peculiar characteristics of the properties that belong to the same submarket. This study reviews this concept, since it must serve as a single interpretative scheme for a systematic reading of the real estate phenomenon by means of data-mining models or big-data management (Case et al. 2004; Fik et al. 2003).

Cluster techniques, developed in other scientific fields, have been used effectively to implement mass appraisal in fiscal equalization (Nesticò et al. 2014) and fair land planning (Giuffrida et al. 2014), where the direct or “phenomenological” approach (which is generally applied for a case-by-case valuation) must be modified to represent the structural tendencies of the market and how real estate values react to specific or areal transformations within the city (Chan et al. 2012; Gabrielli 2013; Hepşen and Vatansever 2012).

The study also provides the occasion for a few methodological remarks on the representation of the real estate market at both the property and the urban level. It also shows some of the major difficulties in developing a standardized information support that allows systematic analysis of the observations and comparisons between different urban districts.

2 The Real-Estate Market Survey

This study analyzes the area of Palermo corresponding to ten districts that have varied historical, representative, and functional qualities because of the time of their establishment and the most recent urban transformations. The boundaries of the area are the Mediterranean coast on the East, Regione Siciliana Street on the South and West, and Mount Pellegrino on the North. This area covers about 48 km² (30% of the municipal land), where 55% of the population is located (about 370,000 people). The districts are as follows: Q1 Settecannoli-Brancaccio is a working-class suburb; Q2 Oreto-Stazione, Q5 Montegrappa-S. Rosalia, Q6 Cuba-Calatafimi, Q7 Zisa-Noce, and Q10 Malaspina-Palagonia are low- and medium-income districts located near the city center; Q3 Tribunali-Castellammare is a part of the historic center where there is a mix of social classes; Q8 Politeama and Q9 Libertà constitute the city “center”, which was built between the end of the 19th and the beginning of the 20th century and where high-income households live; and Q11 Resuttana-S. Lorenzo is a middle-class suburb (Fig. 1).

The market survey collected 500 residential properties that were for sale in the abovementioned districts in 2014 (Fig. 1). The data sample describes the houses by four types of characteristics ($k_{e}$ location, $k_{i}$ intrinsic, $k_{t}$ technological, and $k_{a}$ architectural) (Forte 1968), which are organized in 28 quantitative and qualitative attributes as shown in Table 1. The data sample also contains the asking prices and the prices per square meter and per room. Each attribute is expressed on a standard scale (from 1 to 5), so that the scores are aggregated at the characteristic level and, afterward, the overall quality $k^{*}$ of each house is obtained as the weighted average of its four characteristic scores.

Table 1 Characteristics and attributes of the houses

Because the coefficients of the four regressors estimated by multiple linear regression were generally inconsistent across the districts, the weights for the score aggregation were calculated from the mean of the most significant coefficients in each district, after excluding the negative or the highest ones. Moreover, the weights $\lambda_{j}$ of each district were varied iteratively within the ranges shown below to obtain the maximum value of $R^{2}$. Assuming $\sum\nolimits_{j} \lambda_{j} = 1$, with $j = e, i, t, a$, the ranges are: $0.10 \le \lambda_{e} \le 0.30$; $0.10 \le \lambda_{i} \le 0.30$; $0.30 \le \lambda_{t} \le 0.50$; $0.10 \le \lambda_{a} \le 0.30$.
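The weight search described above can be sketched as a small grid search. This is only an illustration under stated assumptions: the chapter does not specify the step of the iteration or the exact regression used, so the 0.05 step, the simple regression of price on the overall quality $k^{*}$, the function names, and the six sample houses are all hypothetical.

```python
from itertools import product

# Ranges for the weights, taken from the text above.
RANGES = {"e": (0.10, 0.30), "i": (0.10, 0.30), "t": (0.30, 0.50), "a": (0.10, 0.30)}


def r_squared(x, y):
    """R^2 of the simple linear regression of y on x (squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


def grid(lo, hi, step):
    """Evenly spaced candidate weights from lo to hi (inclusive)."""
    count = round((hi - lo) / step)
    return [round(lo + m * step, 10) for m in range(count + 1)]


def best_weights(scores, prices, step=0.05):
    """Search weights (l_e, l_i, l_t, l_a) that sum to 1 and maximize R^2.

    scores: list of (k_e, k_i, k_t, k_a) tuples, one per house (1-5 scale).
    prices: list of unit prices, one per house.
    The step size is an assumption; the source only says "iteratively varied".
    """
    best = (None, -1.0)
    candidates = product(grid(*RANGES["e"], step),
                         grid(*RANGES["i"], step),
                         grid(*RANGES["t"], step))
    for le, li, lt in candidates:
        la = round(1.0 - le - li - lt, 10)  # enforce sum of weights = 1
        lo_a, hi_a = RANGES["a"]
        if not (lo_a - 1e-9 <= la <= hi_a + 1e-9):
            continue
        # Overall quality k* as the weighted average of the four scores.
        k_star = [le * s[0] + li * s[1] + lt * s[2] + la * s[3] for s in scores]
        r2 = r_squared(k_star, prices)
        if r2 > best[1]:
            best = ((le, li, lt, la), r2)
    return best


# Hypothetical data for six houses (not taken from the survey).
scores = [(2, 3, 4, 3), (3, 3, 2, 4), (4, 2, 5, 3), (1, 4, 3, 2), (5, 3, 4, 4), (3, 5, 2, 3)]
prices = [1200, 1100, 1700, 900, 1900, 1250]
weights, r2 = best_weights(scores, prices)
print(weights, round(r2, 3))
```

Fixing three weights and deriving the fourth keeps every candidate on the constraint $\sum_{j} \lambda_{j} = 1$, so only combinations whose fourth weight also falls inside its range are evaluated.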

A sample of the database for the Q3 district is shown in Table 2.

Table 2 Database for the Q3 district

3 Clustering Methodology

Especially when the real estate stock is widely heterogeneous, data-mining procedures may be applied to achieve a consistent articulation of the real estate market into submarkets that represent the similarities between objects described by a standardized set of shared characteristics. Cluster approaches can be classified into the following types: hierarchical, non-hierarchical (partitions), grid-based, and model-based.

The k-means algorithm, which is a non-hierarchical method (Jardine and Sibson 1968), has been applied to the dataset described above. The output of this algorithm is the partition of elements that optimizes a certain objective function, and it is based on the assumption of distributing the elements of a sample over a predetermined number of groups (King 2014; Everitt et al. 2011; Kaufman and Rousseeuw 1990). The number of possible partitions of n elements grows very quickly with n (into two groups alone there are already $p = 2^{n - 1} - 1$ partitions). It can be reduced through the initial choice of the number of groups of the partition; consequently, the optimal partition can be sought among those partitions having the chosen number of groups, using a criterion that depends on the algorithm applied.

The k-means algorithm forms k groups using certain values as initial centroids and placing each element into the group whose centroid is closest (proximity is measured using the Euclidean metric). Once the first partition has been computed, the centroids are recalculated and the assignment step is repeated with the new centroids, until convergence is obtained (the condition is that each element is assigned to the same group as in the previous partition). When this condition has been verified, the final partition has been obtained; it may be a local rather than a global optimum (Steinley 2003, 2006) (Table 3).
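The iteration described above can be sketched in a few lines. This is a minimal illustration, not the routine implemented in IBM SPSS: the initial centroids are simply drawn at random from the sample (an assumption made here for brevity), and the function names and the six two-dimensional points are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(data, g, seed=0, max_iter=100):
    """Partition data into g groups; returns (labels, centroids).

    The result depends on the initial centroids, which are drawn at random
    from the sample here (an assumption, not the SPSS selection rule).
    """
    rng = random.Random(seed)
    centroids = rng.sample(data, g)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each element goes to its closest centroid.
        new_labels = [min(range(g), key=lambda c: sq_dist(x, centroids[c]))
                      for x in data]
        # Convergence: every element stays in the same group as before.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: recompute each centroid as the mean of its members.
        for c in range(g):
            members = [x for x, lab in zip(data, labels) if lab == c]
            if members:  # an empty group keeps its previous centroid
                centroids[c] = mean_point(members)
    return labels, centroids


# Hypothetical observations (e.g. two quality scores per house).
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
labels, centroids = k_means(data, g=2)
print(labels, centroids)
```

Running the function with several seeds and comparing the resulting partitions is one simple way to check how sensitive the outcome is to the starting centroids.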

Table 3 Steps of the k-means algorithm

Some problematic aspects of the iterative k-means procedure concern the choice of both the initial centroids and the number of groups g. The initial centroids are the starting point from which the search for the final partition begins. If there are no specific indications regarding them, an internal algorithm of the software (IBM SPSS) selects the centroids among the elements of the sample so that they are well spaced. Alternatively, the analysis can be performed many times, and the final partition will be the one that is most consistent with what is known about the dataset from domain knowledge. Regarding the number of groups g, if it is not available a priori on the basis of the dataset, then the procedure can be applied several times by varying g ($g = 2, 3, \ldots$) and choosing the value of g according to the Calinski-Harabasz (CH) index (Milligan and Cooper 1985; Yanchi et al. 2010). The CH index is calculated in the following way:

$$CH(g) = \frac{B(g)/(g - 1)}{W(g)/(n - g)}, \quad B(g) = \sum\limits_{i = 1}^{g} n_{i}\, d^{2}\left( \bar{x}_{i}, \bar{x} \right), \quad W(g) = \sum\limits_{i = 1}^{g} \sum\limits_{j:\, x_{j} \in C_{i}} d^{2}\left( x_{j}, \bar{x}_{i} \right) \tag{1}$$

where: B is the external deviance (between the groups); W is the internal deviance (within the groups); g is the number of groups; $\bar{x}_{i}$ is the mean value of the observations belonging to the i-th cluster $C_{i}$; $n_{i}$ is the number of observations in $C_{i}$; $\bar{x}$ is the mean value of the entire sample; $x_{j}$ is the j-th observation; d is the Euclidean metric; and n is the number of observations. The higher this index, the better the partition, since it represents the ratio between the external variance and the internal variance of the partition.
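As a worked illustration of the index, the sketch below computes CH for a given partition. It assumes the usual form of the index, with squared Euclidean distances and the between-group term weighted by cluster size; the function names and the four one-dimensional observations are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def calinski_harabasz(data, labels):
    """CH index of a partition: (B / (g - 1)) / (W / (n - g)).

    Assumes squared distances and size-weighted between-group deviance.
    Raises ZeroDivisionError if the within-group deviance W is zero.
    """
    n = len(data)
    groups = {}
    for x, lab in zip(data, labels):
        groups.setdefault(lab, []).append(x)
    g = len(groups)
    if g < 2 or g >= n:
        raise ValueError("the index needs 2 <= g < n")
    overall = mean_point(data)
    # External deviance: cluster means against the overall mean.
    between = sum(len(m) * sq_dist(mean_point(m), overall) for m in groups.values())
    # Internal deviance: observations against their own cluster mean.
    within = sum(sq_dist(x, mean_point(m)) for m in groups.values() for x in m)
    return (between / (g - 1)) / (within / (n - g))


# Hypothetical one-dimensional sample with two obvious groups.
data = [(0.0,), (2.0,), (10.0,), (12.0,)]
ch_good = calinski_harabasz(data, [0, 0, 1, 1])  # compact, well-separated groups
ch_bad = calinski_harabasz(data, [0, 1, 0, 1])   # groups that mix near and far points
print(ch_good)  # 50.0
print(ch_bad)   # 0.08
```

For the first partition B = 100 and W = 4, so CH = (100/1)/(4/2) = 50; for the second B = 4 and W = 100, so CH = 0.08. Computing the index for each candidate g and keeping the largest value is the selection rule described in the text.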

4 Application of Cluster Analysis

The cluster analysis (k-means algorithm) is applied to the data sample by deciding in advance that the candidate numbers of clusters are 3, 4, and 5 (because of the limited variability of the overall quality in each district) and leaving the choice of the initial centroids to the software.

Figure 2 shows the resulting values of the CH index and the number of clusters (best partitions) for which the CH index is maximized for each district:

3 clusters for the Q2, Q3, Q5, Q8, and Q9 districts;
4 clusters for the Q7 district;
5 clusters for the Q1, Q6, Q10, and Q11 districts.

In general terms, the resulting clusters are sufficiently representative of the local housing market. The suburbs, such as the Q1 and Q11 districts, have a high degree of inner heterogeneity caused by various land uses (residential, industrial, and a shopping center) and by various states of maintenance of the buildings, and this complexity can be better expressed through numerous groups of properties (in this case, 5 clusters). The central districts, such as Q3, Q8, and Q9, are instead quite homogeneous, because their urban fabric dates from the same period and their building typologies are similar, and they may be described through 3 clusters only.

However, by observing the relations between overall quality and prices in the scatter graphs (Fig. 2), we note that significant differences may occur between districts having the same number of clusters. By comparing the Q1 and Q11 districts, for example, it can be noted that the price elasticity with respect to the overall quality is very low in the first district, whereas it is high in the second one. The low price elasticity may be explained by the fact that Q1 is a working-class suburb where the lack of public facilities holds back any price increase, even if the intrinsic and technological characteristics are of good quality. Among the districts with three clusters, the data points in the Q2 and Q8 districts are quite close to the trend line, whereas in the Q9 district the data points are much more scattered, so that market prices differ greatly for the same overall quality.

Examining the partition of the Q1 district (5 clusters) and the data set of the properties involved in greater detail (Fig. 3) shows that:

clusters 1 and 2 represent two groups of similar properties: all of them have the same values of $k_{e}$ and $k_{a}$, but the first group has a higher $k_{t}$ and a lower $k_{i}$ than the second group;
clusters 3 and 5 are also comparable except for $k_{i}$;
the properties in cluster 4 have the lowest prices and the worst characteristics of the district.

In the partition of the Q8 district into 3 clusters (Fig. 4; Table 4):

Table 4 Statistical results in the Q8 district

cluster 1 is very homogeneous: the characteristics of all its properties have the highest quality, and the corresponding prices are higher than the mean price;
in cluster 2, the properties either have a low score for each $k$, with an especially low $k_{e}$ because of their location in the blighted area of Borgo Vecchio, or have a high location score $k_{e}$ and low scores for the other characteristics; in the latter case, prices rise because the market assigns location a higher marginal price than the other features;
cluster 3 includes the properties with intermediate characteristics.

5 Conclusions

The results of the cluster analysis revealed that the housing market in each district has its own degree of complexity and peculiar relations between the market prices and the clusters representing the housing characteristics. The best number of clusters, chosen on the basis of the Calinski-Harabasz index, expressed the varying internal heterogeneity of each district and represented the urban complexity.

The relationship between asking price and characteristics can vary significantly within the same cluster even when the quality of the characteristics is almost equivalent. This is indicative of the typical information asymmetry and opacity of the real estate market and, moreover, of the current uncertainty and instability of the social and economic system: the owners of real estate capital have dissimilar expectations of capital gains or losses (plus-minus valorization) and translate them into different asking prices (Rizzo 1999).

Cluster analysis may be a useful tool to manage and analyze big data for describing, even if not exhaustively, the structure of the real estate market, because this approach can select homogeneous groups of properties, reduce the degree of intrinsic complexity of the urban property data, and build a knowledge system to support the implementation of urban policy.

References

Breuer W, Nadler C (2012) Real estate and real estate finance as a research field. An international overview. J Bus Econ 82:5–52
Case B, Clapp J, Dubin R, Rodrigues M (2004) Modeling spatial and temporal house price patterns: a comparison of four models. J Real Estate Finance Econ 29:167–191
Chan L, Ng HT, Ramchand R (2012) A cluster analysis approach to examining Singapore’s property market. Property markets and financial stability. BIS Pap Bank Int Settlements 64:43–53
Everitt B, Landau S, Morvene L, Stahl D (2011) Cluster analysis, 5th edn. Wiley, New York
Fik TJ, Ling DC, Mulligan GF (2003) Modeling spatial variation in housing prices: a variable interaction approach. Real Estate Econ 31:623–646
Forte C (1968) Elementi di estimo urbano. Napoli
Gabrielli L (2013) Cluster analysis and Italian real estate market analysis during the downturn. In: 20th Annual European Real Estate Society Conference, ERES European Real Estate Society
Gabrielli L, Giuffrida S, Trovato MR (2015) From surface to core: a multi-layer approach for the real estate market analysis of a Central Area in Catania. In: Gervasi O et al. (eds) Computational science and its applications – ICCSA 2015, vol III. Springer, Heidelberg, pp 284–300
Giuffrida S, Ferluga G, Valenti A (2014) Clustering analysis in a complex real estate market: the case of Ortigia (Italy). In: Murgante B et al. (eds) Computational science and its applications – ICCSA 2014, vol III. Springer, Heidelberg
Giuffrida S, Ferluga G, Valenti A (2015) Capitalization rates and “real estate semantic chains”. An application of clustering analysis. Int J Bus Intell Data Min 10(2):174–199
Hepşen A, Vatansever M (2012) Using hierarchical clustering algorithms for Turkish residential market. Int J Econ Finan 4:138–150. doi:10.5539/ijef.v4n1p138
Jardine N, Sibson R (1968) The construction of hierarchic and non-hierarchic classifications. Comput J 1:177–184
Kaufman L, Rousseeuw P (1990) Finding groups in data: an introduction to clustering analysis. Wiley, New York
King R (2014) Cluster analysis and data mining: an introduction. Mercury Learning & Information, Herndon
Milligan G, Cooper M (1985) An examination of procedures for determining the number of clusters in a data set. Psychometrika 50:159–179
Nesticò A, De Mare G, Galante M (2014) The sustainable limit of the real estate tax: an urban-scale estimation model. In: Murgante B et al. (eds) Computational science and its applications – ICCSA 2014, vol III. Springer, Heidelberg
Rizzo F (1999) Valore e valutazioni. La scienza dell’economia o l’economia della scienza. FrancoAngeli, Milano
Steinley D (2003) Local optima in k-means clustering: what you don’t know may hurt you. Psychol Methods 8:294–304
Steinley D (2006) Profiling local optima in K-means clustering: developing a diagnostic technique. Psychol Methods 11(2):178–192
Yanchi L, Zhongmou L, Hui X, Xuedong G, Junjie W (2010) Understanding of internal clustering validation measures. In: IEEE international conference on data mining. http://datamining.it.uts.edu.au/icdm10/index.php. Accessed on 18 Sept 2015

Acknowledgments

The work was discussed collectively by the authors in all its parts. However, Grazia Napoli edited Sections 2, 4 and 5; Salvo Giuffrida, Sections 1 and 5; and Alberto Valenti, Section 3.

Author information

Authors and Affiliations

Department of Architecture, University of Palermo, Palermo, Italy
Grazia Napoli
Department of Civil Engineering and Architecture, University of Catania, Catania, Italy
Salvatore Giuffrida
Professional, Palermo, Italy
Alberto Valenti

Corresponding author

Correspondence to Grazia Napoli.

Editor information

Editors and Affiliations

Università IUAV, Venezia, Italy
Stefano Stanghellini
Politecnico di Bari, Bari, Italy
Pierluigi Morano
Politecnico di Torino, Torino, Italy
Marta Bottero
Politecnico di Milano, Milano, Italy
Alessandra Oppio

About this chapter

Cite this chapter

Napoli, G., Giuffrida, S., Valenti, A. (2017). Forms and Functions of the Real Estate Market of Palermo (Italy). Science and Knowledge in the Cluster Analysis Approach. In: Stanghellini, S., Morano, P., Bottero, M., Oppio, A. (eds) Appraisal: From Theory to Practice. Green Energy and Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-49676-4_14

DOI: https://doi.org/10.1007/978-3-319-49676-4_14
Published: 31 December 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49675-7
Online ISBN: 978-3-319-49676-4
eBook Packages: Energy, Energy (R0)
