Introduction

Background

Land use change modeling is an essential process for various urban-related applications. It provides the baseline information required to analyze future urban growth and development (Li and Yeh 2002). Moreover, it can assist planners and decision makers in identifying optimum locations for community facilities and in preventing social and environmental problems that hinder sustainable development (Hathout 2002). Compact cities are among the most sustainable urban forms because of characteristics such as higher urban density, revitalization and redevelopment of central areas instead of rural development, efficient public transportation, and proper provision of community facilities (Burton et al. 2003; Livingstone and Rogers 2003). These properties are seen to contribute to sustainable development in social, economic, and environmental terms. This paper presents the application of a land use change modeling process that employs the optimum parameters for compact urban development and evaluates these parameters with respect to urban development and land use changes. The analysis and results emphasize the assessment of city compactness factors, the growth direction of the urban land use types, and their impact on the loss of existing agricultural fields.

Land use change modeling

Land use change results from the complex interaction of environmental, physical, political, economic, and cultural factors (Houghton 1994; Medley et al. 1995). Understanding the causes and rate of these changes is important because of their significant effects on the surrounding natural environment, air and water quality, local temperature, and urban economy, as well as other social impacts (Bingham et al. 1995; Pijanowski et al. 2002). Generally, land use change models rest on four core principles (Koomen and Borsboom-van Beurden 2011): (1) historical, which is based on previous trends of land use changes (Kuijpers-Linde et al. 2007); (2) suitability, which refers to site assessment based on several related parameters (Bagdanavičiūtė and Valiūnas 2013; Abdullahi et al. 2014); (3) neighborhood, which considers the neighborhood interaction of each cell (Kocabas and Dragicevic 2007; Li et al. 2008; Al-shalabi et al. 2013); and (4) actor interaction, which evaluates the interaction of several agents on land use changes (Matthews et al. 2007). Furthermore, Verburg et al. (2004), Heistermann et al. (2006), and Koomen and Stillwell (2007) categorized land use change models according to six main concepts: Markov chain, economic-based systems, agent-based systems, statistical analysis, cellular automata (CA), and artificial neural network (ANN). All of these concepts build on the aforementioned four core principles with the aim of translating the real world into a model. They have been explained in detail by different researchers: the Markov chain concept by Koomen and Borsboom-van Beurden (2011) and Corner et al. (2014), agent-based systems by Parker et al. (2003) and Grimm et al. (2006), statistical approaches by Verburg et al. (2004), ANN by Skapura (1996) and Pijanowski et al. (2002), and CA modeling by Li and Yeh (2002) and Li et al. (2008).

Land use processing of this kind requires reliable spatial data, processing tools, mapping environments (software), and methods, all of which are strongly supported by GIS environments and remotely sensed data. Remote sensing provides spatial data with the preferred coverage in a timely and cost-effective manner for urban-related applications (Tan et al. 2010; Hamedianfar and Shafri 2015). GIS, in turn, helps in collecting, storing, organizing, analyzing, and illustrating spatial data for the corresponding processes. Integrating GIS with remotely sensed data provides tools for modeling and quantitatively measuring landscape trends at high spatial scale and resolution. In addition to these supporting technologies and systems, several sophisticated computational techniques are available to perform the modeling process. Cellular automata is one of the most common methods in this field. It is based on discrete time steps and on the neighborhood interaction of surrounding pixels, which affects the transition of a specific land use category to another. This model has several advantages for land use change modeling, such as the application of dynamic spatial variables during the iterative looping (Li and Yeh 2002), new aggregate centers (Wu 1998), fractal properties (White and Engelen 1993), and complex patterns arising from local interactions (Batty and Xie 1994). The model is also based on the historical concept, i.e., the trend of past changes has significant effects on future changes. Hagoort et al. (2008) and Norte Pinto and Pais Antunes (2007) prepared comprehensive reviews of the history of cellular automata and its applications in urban research. However, determining the factor values is one of the major problems with CA modeling (Kamusoko et al. 2009; Corner et al. 2014), and the complexity of the model increases when several land use types are included (Batty et al. 1999). Another important issue is how to define transition rules and model structures, which are generally application dependent (Li and Yeh 2002). To address these difficulties, the CA model is calibrated to ensure the accuracy of model performance (Wu 1998; Li and Yeh 2002; Kocabas and Dragicevic 2007). Several studies have reported on the integration of CA modeling with various techniques, such as logistic regression (Arsanjani et al. 2013), analytical hierarchy process (Wu 1998), Bayesian network (Kocabas and Dragicevic 2007), ANN (Li and Yeh 2002), Markov chain (Al-sharif and Pradhan 2013), and rough set theory (Wang et al. 2011).

Another global parametric statistical approach to defining transition rules is probability estimation using a regression model of urban changes with respect to several urban parameters. This transition probability estimation can be implemented with the weights-of-evidence (WoE) model, which is based on the Bayes rule of conditional probability. Tayyebi et al. (2014) stated that regression models provide better explanatory power and outperform some methods such as ANNs when the functional relationships between the dependent and independent variables are known. Another advantage of WoE over other techniques is that it calculates a weight for each driving factor based on the occurrence and non-occurrence of the events in the study area. One of the main assumptions of this method is that prior knowledge of past events can be used to anticipate future occurrences (Regmi et al. 2010). In addition, despite its simplicity and modest time requirements for data acquisition and processing, this method has been used successfully to examine events, spatial relationships, and the distribution of features (Dahal et al. 2008). The technique has been applied in different fields such as geological and mineral mapping (Gettings et al. 2004; He et al. 2010; Chen et al. 2013), natural disaster management (Althuwaynee et al. 2012; Tien Bui et al. 2012; Youssef et al. 2015), land use dynamic modeling (Maria de Almeida et al. 2003; Abdullahi and Pradhan 2016), and especially compact city modeling (Abdullahi et al. 2015b). In fact, WoE assesses the degree to which the evidence supports or contradicts the corresponding assumption (Dempster 1967; Shafer 1976). This model is applicable when enough information is available to evaluate the relative importance of evidential themes through statistical concepts (Bonham-Carter 1994).

Although several models are used in land use change modeling, very few studies have directly applied or integrated the compact city concept with the land use change modeling process. For instance, Mubareka et al. (2011) introduced a composite index to characterize urban expansion based on the degree of compactness of urban land with a 1-km resolution land use model. Li et al. (2008) introduced a method for modifying urban signatures for simulating compact development in large complex regions. In contrast, the current study attempts to integrate compact city parameters with the land use change modeling process to evaluate the relationship between these parameters and urban growth at the level of a high spatial resolution land use map. In addition, this integrated modeling approach evaluates the environmental perspective of compact development with respect to the loss of agricultural and natural spaces. The proposed model integrates WoE, a factor-based approach, with CA, a cellular-based approach, to provide effective cellular-based, data-driven land use change modeling. WoE was applied to reveal the amount and trend of different land use changes using time series data. Additionally, this model assesses the level of importance of the related factors in affecting the changes. These outputs were used to define the transition rules for CA modeling and to project the future land use conversion of the study area.

Data and methodological process

Kajang City (21 km from Kuala Lumpur, Malaysia), with a total area of 60 km² (Fig. 1), was selected as the case study to examine the modeling approach described above. In recent years, because of its proximity to Kuala Lumpur, the city has faced unorganized and sprawling development. An increasing proportion of brownfields and the destruction of farmland are results of this sprawl. The western part of the city consists mainly of agricultural and forest lands. Therefore, the effects of the growth and change of various land use types can be adequately observed, particularly on the natural environment. Although many abandoned plots exist within the municipality, recent growth and development are occurring at the outskirts, in agricultural and rural areas. This study seeks to provide the local planning authority with information regarding the degradation of the natural environment and possible solutions toward compact urban development.

Fig. 1 Kajang City, Malaysia

Model development

The overall data and modeling processes of this study are shown in Fig. 2. The municipality of Kajang City provided most of the data used for these processes (Table 1). The land use datasets of Kajang City, at a scale of 1:5000, comprise seven land use categories (Fig. 3).

Fig. 2 Flowchart of the first and second land use change modeling process

Table 1 Data and layers used for land use modeling

Fig. 3 Land use map of years 2008, 2012, and 2015

The identification of the driving factors of urban change starts with a conceptualized idea of the compact city and its paradigms. Based on the literature, three general indicators describe urban compactness: density, mixed development, and intensity (Burton 2002; Abdullahi et al. 2015a). Urban density comprises population density, building density, residential density, and many others. From the urban sustainability point of view, higher density is generally considered preferable to lower density. Land use diversity, or mixed land use development, refers to areas used for multiple functions and activities. Urban intensity refers to the characteristics of an area regarding facility distribution, accessibility, infrastructure, public transportation, and infill development. Several studies have shown that higher density, land use diversity, and efficient public transportation are essential to achieving healthy and sustainable neighborhoods (Ding 2004; Gainza and Livert 2013; Gu et al. 2013; Song et al. 2013).

Furthermore, socioeconomic and physical properties of the Kajang City (Table 1) were included in the analysis. These characteristics are known to exert direct effects on growth and change of various land use types.

  • Land use maps: Land use and land cover maps are the most essential input for urban application projects. These data can be used to extract the trend of changes in the landscape. Therefore, these layers provide fundamental information for the evaluation, analysis, modeling, and prediction of changes in urban and rural areas. Three land use maps of the study area were available, for the years 2008, 2012, and 2015 (Fig. 3). In addition, the master plan of Kajang City was collected and used to ensure the compatibility of the model with local policy and decisions.

  • Road network: This variable is also essential for all types of urban applications, particularly urban planning and development. All the urban land uses (residential, commercial, recreation, institutional, etc.) are connected to each other by road or street networks. In addition, most of the community facilities, public transportation nodes (train, bus, and taxi stations), public attractions, commercial buildings, and institutional and governmental offices are located on the major streets. Hence, proper accessibility to major roads supports public transportation as well as walking and cycling. The available road data of the study area consist of various types of roads, from which small and secondary streets had to be removed.

  • Public attraction points: In general, a city has several daily attraction locations (for instance malls and markets) that affect the travel of local residents. Hence, considering these places in the analysis and modeling of an urban area is very important. Specifically, proximity to these locations and/or a proper distribution of these land uses within the municipality has several advantages from the perspective of environmental sustainability. This layer was produced for Kajang City by assessing daily travel with respect to the land use maps and by collecting information from the local planning authority.

  • Public transportation facility: Public transportation is a shared facility available to the general public. It is one of the essential necessities of urban areas and normally consists of several modes such as taxi, bus, and train. Proper planning and design of the transportation network in a community have several advantages. From the urban sustainability perspective, transit-oriented development (TOD) is one of the most common concepts for achieving more sustainable neighborhoods. TOD refers to a highly mixed land use area with proper public transportation modes and stations. Hence, the distribution of the various urban land use categories at the planning and development stage is very important, because accessibility and a proper distribution of public transportation stations reduce the use of private vehicles. The selected study area is served by taxi, bus, and different train networks. The Keretapi Tanah Melayu (KTM) commuter line, one of the main train systems in Malaysia, has one station in Kajang City (south-central), and the proposed Mass Rapid Transit (MRT) has several stations in the central parts of the city.

  • Hazard map (flood zones): In addition to the various social and economic aspects of the city, safety from floods and other natural hazards is an important consideration during planning and development. In the case of Kajang City, particular care is needed for the flood zones that lie along the central north–south areas. As the local council has reported, new construction projects in these zones should be prevented depending on the intensity of the hazard.

  • Population map: The city population is an essential factor in urban applications. High and low population (or population density) is one of the main properties used to characterize sprawling development and/or the social sustainability of a neighborhood. Numerous studies have examined the relation between population density and urban sustainability. The population data collected for this study include detailed information about the local residents of Kajang City (JPBD-Department of Statistics).

  • Soil and geological maps: Soil and geological properties are another significant variable, because they indicate the stability of the land surface under urban structures. Therefore, these properties were used in the modeling process in the form of data layers (collected from the Department of Drainage and Irrigation and the Department of Geoscience and Mineral Resources, respectively).

From the available data layers, the three city compactness indicators were assessed for each land use map (Burton 2002; Abdullahi et al. 2015a, b). Urban density was evaluated by considering the population, residential, building, and road density of Kajang City. Population density was computed as the number of inhabitants of each zone per built-up area of the corresponding zone.

$$ {\text{Population}}\;{\text{density}}\;{\text{of}}\;{\text{zone}}\;i = \frac{{{\text{Population }}\;{\text{of}}\;{\text{zone}}\; i}}{{{\text{Built}}\;{\text{up}}\;{\text{area}}\;{\text{of}}\;{\text{zone}}\;i}}. $$
(1)

Similar formulas were used for the other density measures:

$$ {\text{Residential}}\;{\text{density}}\;{\text{of}}\;{\text{zone}}\;i = \frac{{{\text{Number}}\;{\text{of}}\;{\text{residential}}\;{\text{units}}\;{\text{of}}\;{\text{zone}}\;i}}{{{\text{Built}}\;{\text{up}}\;{\text{area}}\;{\text{of }}\;{\text{zone}}\; i}} $$
(2)
$$ {\text{Road}}\;{\text{density}}\;{\text{of}}\;{\text{zone}}\;i \, = \frac{{{\text{Road}}\;{\text{length}}\;{\text{of }}\;{\text{zone}}\; i}}{{{\text{Total}}\;{\text{area}}\;{\text{of}}\;{\text{zone }}\;i}} $$
(3)
$$ {\text{Building}}\;{\text{density}}\;{\text{of}}\;{\text{zone}}\;i \, = \frac{{{\text{Built }}\;{\text{up}}\;{\text{area}}\;{\text{of }}\;{\text{zone}}\; i}}{{{\text{Total }}\;{\text{area }}\;{\text{of}}\;{\text{zone}}\; i}}. $$
(4)

The resulting maps were standardized to a range from 0 (less compact) to 1 (more compact).
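The density measures of Eqs. 1 to 4 and the subsequent standardization can be illustrated with a minimal sketch. The function names, the zone values, and the choice of linear min-max rescaling are assumptions made for illustration only; the text states that the maps were standardized to the 0 to 1 range but does not specify the rescaling method.

```python
def zone_densities(population, residential_units, road_length,
                   built_up_area, total_area):
    """Return the four density measures of Eqs. 1-4 for one zone."""
    return {
        "population": population / built_up_area,          # Eq. 1
        "residential": residential_units / built_up_area,  # Eq. 2
        "road": road_length / total_area,                  # Eq. 3
        "building": built_up_area / total_area,            # Eq. 4
    }


def min_max_standardize(values):
    """Rescale values linearly to [0, 1] (assumed method, not from the paper).

    A list with no spread maps to all zeros.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical zones with illustrative numbers, not data from the study.
zones = [
    zone_densities(12000, 3000, 18.0, 1.5, 4.0),
    zone_densities(4000, 900, 9.0, 0.8, 5.0),
    zone_densities(20000, 5200, 25.0, 2.0, 3.0),
]
print(min_max_standardize([z["building"] for z in zones]))
```

In this sketch the zone with the lowest building density receives 0 and the zone with the highest receives 1, which mirrors the less compact to more compact scale described above.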

The diversity of land uses was assessed with respect to the existence and proximity of the various land use categories of Kajang City. Although several scholars have studied the measurement of land use diversity, no comprehensive and standard solution has yet been proposed in the literature. Hence, researchers have measured the diversity of land uses using different techniques and data layers (Abdullahi et al. 2015b). This study evaluated mixed development using the proximity and distribution pattern of residential, commercial, industrial, recreation, and community facility land uses. These categories were separated into different layers and used in the following equation:

$$ {\text{LD}}_{\text{Prox.}} = \sum \limits_{i = n}^{m} {P}_{i} \cdot {V}_{i} $$
(5)

where \( \text{LD}_{\text{Prox.}} \) is the land use diversity based on the proximity concept, \( P_i \) is the proportion of the landscape area (pixels) with the corresponding value \( V_i \), and \( V_i \) is the level of proximity of each pixel with respect to the other land use categories, obtained from an overlay process that produces the overall proximity result. \( V_i \) ranges from \( n \) (the number of land use categories) to \( m \) (the number of land use categories times the number of proximity classes). In this case, \( n \) is 5 and \( m \) is 25, computed as 5 (number of land use types) × 5 (number of distance classes). Accordingly, \( \text{LD}_{\text{Prox.}} \) ranges from \( n \) to \( m \) for each landscape, representing single land use development at one end and highly diverse development at the other. The output proximity maps were standardized and accumulated to extract the areas located near several land use categories.
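As a rough illustration of Eq. 5, the following sketch overlays per-category proximity classes and computes the area-weighted sum. The function names are hypothetical, and the class coding (1 for the farthest distance class up to 5 for the nearest) is an assumption; the text gives the value range but not the direction of the coding.

```python
from collections import Counter


def overall_proximity(class_layers):
    """Overlay the proximity-class layers by summing the classes per pixel.

    class_layers holds one flat list of class codes per land use category.
    """
    return [sum(pixel_classes) for pixel_classes in zip(*class_layers)]


def land_use_diversity(overlay_values):
    """Eq. 5: sum of P_i * V_i, with P_i the share of pixels holding V_i."""
    counts = Counter(overlay_values)
    total = len(overlay_values)
    return sum((count / total) * value for value, count in counts.items())


# Five hypothetical layers (one per land use category), two pixels each:
# the first pixel is far from everything, the second is near everything.
layers = [[1, 5]] * 5
overlay = overall_proximity(layers)
print(overlay, land_use_diversity(overlay))
```

With five categories and five distance classes, the per-pixel overlay values fall between 5 and 25, so the resulting index stays within the same bounds.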

Urban intensity was evaluated by considering the availability, quality, and quantity of different urban facilities, public transportation, and infrastructure in the study area. These characteristics were evaluated using detailed information about the inhabitants of each zone. The same concept and equation as in the proximity analysis were used to evaluate every cell of Kajang City with respect to urban intensity. However, at this stage, more complex considerations were involved, and the analysis was not based solely on the proximity concept. The demand for different services and facilities was assessed by considering the number and characteristics of local inhabitants using available guidelines (De Chiara 2001). Lastly, the values were standardized from 0 to 1 based on the intensity evaluation. All three compactness assessments were performed on a cellular basis in order to evaluate every pixel of the study area with respect to each parameter and indicator.

The next important step is to analyze and understand the trend of each land use change during the given period. For this purpose, cross-tabulation was implemented for two maps of 2008 and 2012. Cross-tabulation is a mathematical-based matrix which evaluates the growth and changes in various land use types of the study area (Pontius and Millones 2011). The result of cross-tabulation analysis between two available land use maps indicated that residential, commercial, and industrial areas have significant growth compared with other land use types. These growths caused destruction of natural environments and valuable agricultural areas, as shown in Table 2. Moreover, cross-tabulation determined the proportional conversion of each land use, which was calculated from each area of change with respect to the total area of analysis.

Table 2 Cross-tabulation of land use map of 2008 versus 2012 (Hectare)
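A cross-tabulation of the kind summarized in Table 2 can be sketched as follows. The function names, the class labels, and the flat-list representation of the maps are illustrative assumptions; the sketch only counts how much area moves from each class of the earlier map to each class of the later map, and derives the net change per class.

```python
from collections import Counter, defaultdict


def cross_tabulate(earlier_map, later_map, cell_area=1.0):
    """Return {(from_class, to_class): area} for two co-registered maps."""
    pairs = Counter(zip(earlier_map, later_map))
    return {pair: count * cell_area for pair, count in pairs.items()}


def net_change(table):
    """Net gain (+) or loss (-) of area per class, ignoring unchanged cells."""
    change = defaultdict(float)
    for (src, dst), area in table.items():
        if src != dst:
            change[src] -= area
            change[dst] += area
    return dict(change)


# Hypothetical four-cell maps, not data from the study.
map_t1 = ["agri", "agri", "res", "agri"]
map_t2 = ["res", "agri", "res", "ind"]
table = cross_tabulate(map_t1, map_t2)
print(table)
print(net_change(table))
```

Dividing each entry by the total area of analysis gives the proportional conversion mentioned in the text.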

In addition to cross-tabulation, land use changes had to be predicted quantitatively, for which the Markov chain method was used. The Markov chain model is effective in determining the probability of land use conversions between two maps. This method is extensively applied to model changes and examine the trend of land use/cover by summarizing the conversions into transitional area and transitional probability matrixes (Coppedge et al. 2007; Dadhich and Hanaoka 2011; Sang et al. 2011). In this research, the obtained matrixes were used to analyze and identify the scenarios of future land use changes based on the land use maps of 2008 and 2012. However, this method does not deal with the spatial aspects of change occurrence (Araya and Cabral 2010). Integrating cellular automata with the Markov chain overcomes this problem and provides a strong statistical, spatial, and temporal land use conversion model (Corner et al. 2014).
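The relationship between the transitional area matrix and the transitional probability matrix can be sketched in a few lines. This is a minimal first-order Markov chain under stated assumptions: rows index the class in the earlier map, columns the class in the later map, and one step corresponds to the interval between the two input maps. The function names and the numbers are hypothetical.

```python
def transition_probabilities(area_matrix):
    """Row-normalize a transition area matrix (rows: earlier class)."""
    probs = []
    for row in area_matrix:
        total = sum(row)
        probs.append([a / total if total else 0.0 for a in row])
    return probs


def project_areas(current_areas, probs, steps=1):
    """Apply the transition probabilities 'steps' times to the class areas."""
    areas = list(current_areas)
    n = len(areas)
    for _ in range(steps):
        areas = [sum(areas[i] * probs[i][j] for i in range(n))
                 for j in range(n)]
    return areas


# Hypothetical two-class example (agricultural, residential), in hectares.
area_matrix = [[80.0, 20.0],
               [0.0, 100.0]]
probs = transition_probabilities(area_matrix)
print(probs)
print(project_areas([100.0, 100.0], probs))
```

The projection preserves the total area and yields only quantities per class; it says nothing about where the change occurs, which is the limitation that the integration with cellular automata addresses.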

Model integration

To determine the transitional probabilities of land use change as functions of several factors related to compact urban development, the weights-of-evidence technique, a probability assessment-based method, was applied (Bonham-Carter 1994; Pradhan et al. 2010). Subsequently, these probabilities were used to extract and select the particular cells that will be developed, based on the priority rules estimated by cellular processing. In addition, the land use change process of Kajang City was investigated using a first-order Markov chain process. Finally, both processes were integrated with the CA model to incorporate the neighborhood tendency of cell change into the model.

Because not all of the available data and selected factors have significant effects on the occurrence of land use change, it became necessary to assess their level of importance and to create a shortlist of the most effective factors. In its first stage, the WoE model evaluates the frequency of occurrence and non-occurrence of the phenomenon, which in this case is land use change, with respect to the independent factors. This stage, which deals with the relationships between land use changes and the independent parameters, was used to show the correlation among the input variables. Hence, the frequency of occurrence of residential, commercial, and industrial pixels in each parameter's class was evaluated. The frequency of occurrence was computed using the area ratio of each land use with respect to each factor (consider 1 as the average value). A value higher than 1 denotes a positive correlation and a value lower than 1 indicates a negative correlation. Thus, if a factor does not show a clear positive or negative trend, it can be assumed to have no effect on that specific land use. Each of the related factors was classified into three standard classes with the appropriate range (or scale for distance-based factors), type (or category for nominal factors), and rank (for city compactness factors). In this manner, the relationship between the growth and reduction of each land use type of the study area and the involved factors was assessed. The factors that showed a positive or negative effect were extracted from the list of related factors and were applied to create the probability map for the selected land use categories. The growth probability map is calculated based on the Bayes rule of conditional probability (Thiam 2005; Pradhan et al. 2010; Tien Bui et al. 2012). By overlaying each land use map (for instance the residential category) on every factor layer, the number of pixels in each class of the factors was determined. Given the number of pixels N(L) containing the specific land use type (residential) and the total number of pixels of the study area (Kajang City), N(C), the prior probability of residential occurrence is expressed by:

$$ P(L) = \frac{N(L)}{N(C)}. $$
(6)
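The prior probability of Eq. 6 and the frequency-of-occurrence screening described above can be sketched together. The ratio used here, the share of the land use falling in a factor class divided by the share of the study area falling in that class, is one common reading of the "area ratio" and is an assumption, as are the function names and the toy data.

```python
def prior_probability(land_use_mask):
    """Eq. 6: P(L) = N(L) / N(C) for a 0/1 mask over the study area."""
    return sum(land_use_mask) / len(land_use_mask)


def frequency_ratios(land_use_mask, factor_classes):
    """Ratio of land-use share to area share for each factor class.

    Values above 1 suggest a positive association, values below 1 a
    negative one, and values near 1 no clear effect.
    """
    total = len(land_use_mask)
    total_land_use = sum(land_use_mask)
    if total_land_use == 0:
        raise ValueError("the land use does not occur in the study area")
    ratios = {}
    for cls in sorted(set(factor_classes)):
        in_class = [lu for lu, fc in zip(land_use_mask, factor_classes)
                    if fc == cls]
        land_use_share = sum(in_class) / total_land_use
        area_share = len(in_class) / total
        ratios[cls] = land_use_share / area_share
    return ratios


# Hypothetical 8-pixel study area: residential mask and a distance factor.
residential = [1, 1, 1, 0, 0, 0, 0, 0]
distance_class = ["near"] * 4 + ["far"] * 4
print(prior_probability(residential))
print(frequency_ratios(residential, distance_class))
```

A factor whose classes all return ratios close to 1 would be a candidate for exclusion under the screening rule stated in the text.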

Now, considering the involved factors, \( C = (C_i,\ i = 1, 2, 3, \ldots, n) \), if the number of pixels of residential land use in a specific layer is \( N(L \cap C) \), then the probability of residential growth is calculated using conditional probabilities (Bonham-Carter 1994):

$$ P(L |C) = \frac{P(L \cap C)}{P(C)} = P\left( L \right)\frac{P(C |L)}{P(C)}. $$
(7)

The output of this calculation is the value of C/S(C), where C here denotes the WoE contrast and S(C) its standard deviation. This value shows the spatial relation between the various land uses and the classes of each factor. Positive and negative values indicate high and low probability of occurrence of the selected land use type in the corresponding factor class, respectively.
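For readers unfamiliar with the quantity C/S(C), the following sketch computes the positive and negative weights, the contrast, and the studentized contrast for one factor class from pixel counts, following the standard weights-of-evidence formulation (Bonham-Carter 1994). The function name, argument names, and counts are hypothetical, and the sketch is not necessarily the exact procedure used in this study.

```python
import math


def weights_of_evidence(n_class_and_lu, n_class, n_lu, n_total):
    """Return (W+, W-, C, C/S(C)) for one factor class.

    n_class_and_lu: pixels inside the class where the land use occurs
    n_class:        pixels inside the class
    n_lu:           pixels where the land use occurs
    n_total:        pixels in the study area
    """
    a = n_class_and_lu            # in class, land use present
    b = n_lu - a                  # outside class, land use present
    c = n_class - a               # in class, land use absent
    d = n_total - n_class - b     # outside class, land use absent
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    n_absent = n_total - n_lu
    w_plus = math.log((a / n_lu) / (c / n_absent))
    w_minus = math.log((b / n_lu) / (d / n_absent))
    contrast = w_plus - w_minus
    # Variances of W+ and W- add up to the variance of the contrast.
    s_contrast = math.sqrt(1 / a + 1 / c + 1 / b + 1 / d)
    return w_plus, w_minus, contrast, contrast / s_contrast


# Hypothetical counts: 300 pixels in the class, 200 residential pixels,
# 120 of them inside the class, 1000 pixels overall.
print(weights_of_evidence(120, 300, 200, 1000))
```

A class in which the land use is over-represented gives a positive contrast, and dividing by S(C) expresses how large the contrast is relative to its uncertainty.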

According to Tayyebi and Pijanowski (2014), one way to model multiple land use changes is to decompose the model into several binary classifications, each solved using a binary classifier. In the current study, WoE followed the same concept by decomposing the model into several binary classifications and then evaluating the growth probability of the main land use types separately. Normally, in these kinds of classification, the change or growth of one class is evaluated with respect to all others (One-Versus-All), or all possible pairwise binary classifiers between the n available classes are considered (All-Versus-All). These processes are lengthy and difficult to analyze, especially for larger numbers of binary classifiers. However, in the current study, the WoE was run only for the land use types selected on the basis of the cross-tabulation process.

Next, based on the two matrixes computed from the Markov chain model, specifically the transitional area matrix, CA-Markov integration was implemented to apply the contiguity filter and consequently obtain the projection of growth from 2012 to 2016. This filter developed spatial weighting factors, which were applied to each land use growth map resulting from the WoE approach, in order to give more weight to areas that are close to the existing land use and have a higher probability of change. The filter ensured that the limitation of Markov chain analysis (its lack of a spatial basis) could be overcome. Thus, land use change occurred based on related evidence and was not entirely random. With each pass, CA-Markov reweighted each land use growth map as a result of the contiguity filter applied to each current land use type. Once reweighting was completed, the revised suitability map was run through the model to allocate one-fourth of the required land in the first run and two-fourths of the required land in the second run. The process continued until the full allocation of land for each land use class was achieved. At the end of each run, the land use types were masked and the contiguity filter was run. Subsequently, the result was multiplied by each land use growth map to create the input for the new run. The transitional area matrix has the crucial role of controlling the land area that can be allocated to each land use type over the next 4 years.
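The reweighting and stepwise allocation described above can be illustrated with a deliberately simplified sketch for a single land use type. The 3 × 3 neighbourhood, the use of the occupied share as the contiguity weight, the four passes, and all names and numbers are assumptions for illustration; an actual CA-Markov implementation handles several competing land use classes and may use a different filter.

```python
def contiguity_weight(mask, row, col):
    """Share of the 3x3 neighbourhood (cell included) already occupied."""
    rows, cols = len(mask), len(mask[0])
    hits = cells = 0
    for r in range(max(0, row - 1), min(rows, row + 2)):
        for c in range(max(0, col - 1), min(cols, col + 2)):
            cells += 1
            hits += mask[r][c]
    return hits / cells


def allocate_pass(mask, probability, n_cells):
    """Convert the n_cells free cells with the highest reweighted score."""
    scored = []
    for r, row in enumerate(mask):
        for c, occupied in enumerate(row):
            if not occupied:
                score = probability[r][c] * contiguity_weight(mask, r, c)
                scored.append((score, r, c))
    scored.sort(key=lambda item: -item[0])  # stable sort, highest first
    new_mask = [list(row) for row in mask]
    for _, r, c in scored[:n_cells]:
        new_mask[r][c] = 1
    return new_mask


def simulate(mask, probability, demand, passes=4):
    """Allocate the demanded cells cumulatively over several passes."""
    for i in range(passes):
        quota = demand * (i + 1) // passes - demand * i // passes
        mask = allocate_pass(mask, probability, quota)
    return mask


# Hypothetical 3x3 grid: one existing cell and a growth probability surface.
existing = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
growth_probability = [[1.0, 0.9, 0.1], [0.5, 0.5, 0.1], [0.1, 0.1, 0.1]]
for row in simulate(existing, growth_probability, demand=2, passes=2):
    print(row)
```

In this sketch the demand plays the role of the transitional area matrix entry, limiting how many cells may convert, while the product of growth probability and contiguity weight decides where they convert.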

In general, a quantitative evaluation of the level of similarity between the predicted and real maps is preferable for validating the proposed model. In this study, validation of the proposed modeling approach was performed in two stages. In the first stage, the three growth probability maps created by the WoE model were evaluated against the actual land use map of 2015 using the area under the relative operating characteristic (ROC) curve (Pontius and Schneider 2001; Pradhan and Lee 2010; Chen et al. 2013). This method estimates the spatial relationship between the projected and reference maps. The AUC is the area under the ROC curve and ranges from 0.5 to 1. A value of 0.5 indicates a random relationship between the input maps, whereas a value near 1 indicates a strong relationship, that is, an ideal spatial agreement between the modeled and actual land use maps.
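The AUC can be computed in several equivalent ways. The sketch below uses the rank-sum identity, in which the AUC equals the probability that a randomly chosen changed cell receives a higher score than a randomly chosen unchanged cell. This is a generic illustration with hypothetical names and data, not the specific ROC routine used in the study.

```python
def roc_auc(scores, labels):
    """AUC from predicted scores and 0/1 reference labels.

    Tied scores receive their average rank.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both changed and unchanged cells are required")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical growth probabilities and observed change (1 = changed).
probabilities = [0.1, 0.4, 0.35, 0.8]
observed = [0, 0, 1, 1]
print(roc_auc(probabilities, observed))
```

A probability map that ranks every changed cell above every unchanged cell reaches 1, while a map with no discriminating power stays near 0.5.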

In the second stage, the Kappa statistic was calculated to evaluate the entire projected map of 2016 produced by the proposed integration approach (CA-WoE). This process was performed to assess the validity and reliability of the projected map in terms of the quantity and location of changes with respect to the actual land use map of 2015. The Kappa index of agreement is a measure of proportional accuracy adjusted for chance agreement.
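The adjustment for chance agreement can be made concrete with a short sketch of the standard Kappa formula, computed cell by cell from a predicted and a reference map. The function name and the toy maps are hypothetical.

```python
from collections import Counter


def kappa_index(predicted, reference):
    """Kappa = (observed - expected agreement) / (1 - expected agreement)."""
    n = len(reference)
    if n == 0 or len(predicted) != n:
        raise ValueError("maps must be non-empty and of equal size")
    observed = sum(p == r for p, r in zip(predicted, reference)) / n
    predicted_counts = Counter(predicted)
    reference_counts = Counter(reference)
    # Agreement expected if classes were assigned independently at random
    # with the same class proportions as in the two maps.
    expected = sum(predicted_counts[c] * reference_counts[c]
                   for c in reference_counts) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both maps hold one class")
    return (observed - expected) / (1 - expected)


# Hypothetical four-cell maps (res = residential, agri = agricultural).
projected = ["res", "res", "agri", "agri"]
actual = ["res", "agri", "agri", "agri"]
print(kappa_index(projected, actual))
```

A value of 1 indicates perfect agreement, and a value of 0 indicates agreement no better than expected by chance.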

In addition to these validation processes, normal CA was also run without considering the effects of factor analysis using WoE approach. This process was applied on land use maps of years 2008 and 2012 to create the projected map for year 2016. This process was run to verify that the proposed integration model (CA-WoE) can generate more valuable and reliable information about the future pattern of the study area.

Finally, after validation of the proposed integration modeling approach for the land use map of year 2016, the process was run with input maps of year 2008 and 2015 to predict the future land use map for year 2022.

Results and discussion

Quantify land use change

Table 2 summarizes the overall land use changes from 2008 to 2012. The table shows the area of each land use type that was converted to another type. The land use map of 2008 (columns) is cross-tabulated with the land use map of 2012 (rows). The most notable feature of this matrix is the growth of residential land use at the expense of almost all other land use categories. The resulting loss of 333 ha of natural environment and agricultural fields was a serious loss that could have been prevented. The growth of commercial and industrial areas also caused the loss of 63 and 150 ha of natural spaces and agricultural fields, respectively. The last row and second last column of Table 2 show the sum of growth and loss of the various land use types, whereas the last column shows the overall change of all land use types as positive and negative values. It can be seen that during the selected period Kajang City lost more than 535 ha of its natural and green environments. However, the growth of agricultural fields is a good effort to revitalize the existing abandoned lands for food production and industries. In addition, residential, commercial, and industrial land uses grew by 195, 74, and 84 ha, respectively. These land use transitions clearly show that residential, commercial, and industrial are the three main land use types that have grown more significantly than the other types. For this reason, this research focused on these three land use types to predict future changes in Kajang City.

As previously explained, the optimization process was conducted to examine and extract the most effective factors. For instance, the results showed that increasing the distance from residential areas causes a noticeable reduction in the growth of this land use type. Similarly, proximity to industrial areas reduces the growth of residential areas. Therefore, proximity to industrial buildings has an inverse relationship with the growth of residential areas. However, soil and geological properties have neither a positive nor a negative effect on the growth of residential land use. One reason for these neutral effects is the homogeneity and distribution pattern of soil and geology types in Kajang City. Therefore, these ineffective factors were excluded from further processing. In general, proximity to the same land use type and accessibility (proximity to road networks and public transportation) were the most effective factors for the growth of the various land use types.

Land use change modeling using WoE

The WoE model calculated the cell transition probabilities for the three main land use types. The total number of cells in the study area was 56,537,675 (1 m² cell size). The numbers of residential, commercial, and industrial cells were 15,361,526; 1,803,674; and 5,678,861, respectively. A summary of the WoE calculations for residential, commercial, and industrial land use growth is shown in Table 3. In this table, the value of C/S(C) indicates the spatial relation between the cells of each land use type and each factor class. Proximity to the same land use category increases the chance of growth of that category. Thus, as the distance from residential areas increases, the probability of residential growth decreases substantially. By contrast, residential and industrial areas tend to be separated from each other. However, adjacency to various community facilities is advantageous for residential areas.

Table 3 The WoE calculation results of residential, commercial, and industrial land use for the 2008–2012 period
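As an illustration of how a C/S(C) value of the kind reported in Table 3 can be obtained for one factor class, the following sketch uses the commonly used formulation of weights of evidence: positive and negative weights, their contrast C, and the contrast divided by its standard deviation S(C). The cell counts and the function name `weights_of_evidence` are hypothetical, and the exact implementation used in this study may differ.

```python
import math

def weights_of_evidence(n_class_change, n_class_nochange,
                        n_other_change, n_other_nochange):
    """Return W+, W-, contrast C and C/S(C) for one factor class.

    Arguments are cell counts: inside/outside the factor class, split by
    whether the cell changed to the land use type of interest.
    All four counts must be greater than zero.
    """
    n_change = n_class_change + n_other_change
    n_nochange = n_class_nochange + n_other_nochange

    # W+ : evidence carried by the presence of the factor class.
    w_plus = math.log((n_class_change / n_change) /
                      (n_class_nochange / n_nochange))
    # W- : evidence carried by the absence of the factor class.
    w_minus = math.log((n_other_change / n_change) /
                       (n_other_nochange / n_nochange))

    contrast = w_plus - w_minus
    # Variance of the contrast is the sum of the variances of both weights.
    variance = (1 / n_class_change + 1 / n_class_nochange +
                1 / n_other_change + 1 / n_other_nochange)
    return w_plus, w_minus, contrast, contrast / math.sqrt(variance)

# Hypothetical counts for a class such as "close to existing residential cells".
w_plus, w_minus, contrast, studentized = weights_of_evidence(800, 9200, 200, 89800)
print(w_plus, w_minus, contrast, studentized)
```

A positive C/S(C) indicates that the factor class favors growth of the land use type, and a negative value indicates that it discourages growth.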

Moreover, in Kajang City, industrial areas are located in rural and undeveloped areas. Consequently, for industrial land use, negative values can be observed (Table 3) in proximity to roads and public transportation nodes. The results of the city compactness evaluation ranged from high density, high intensity, and high land use diversity to low values of these indicators. These results are evidence of the direct effects of city compactness on land use growth. Higher density, intensity, and mixed development resulted in positive values of C/S(C) for residential and commercial areas. By contrast, areas with non-diverse land use types and low density and intensity are more suitable for industrial land use growth. Figure 4 illustrates the growth probability maps of these land use categories. In general, the central parts of Kajang City have a higher probability of growth for residential and commercial use, whereas the eastern and western sides have a higher probability of growth for industrial land use. However, residential areas have a broader extent than commercial areas. The areas with a higher probability of commercial growth are mainly located along the main roads in the central parts and near the main public transportation nodes of Kajang City (southern parts). In contrast, residential growth has a higher probability over a wider extent, mainly in the central parts. Industrial land use has a higher probability of growth in the western regions, which are mainly covered by agricultural fields, and in the eastern parts near existing industrial buildings and open spaces.

Fig. 4 Probability of growth maps of residential, commercial, and industrial land use for year 2016

Land use change modeling using CA-WoE

Markov chain analysis was carried out by developing two transitional matrices. Although the transitional matrices appear accurate, the output map showed a salt-and-pepper appearance because the analysis lacks knowledge of the spatial distribution of each land use type. The transitional probability matrix gives the probability that each land use type will change to another type. This matrix was calculated from the cross-tabulation matrix by adjusting the proportional errors. The transitional area matrix gives the number of pixels that are expected to change to another type. This matrix (Table 4) was obtained by considering the transitional probability matrix values as well as the number of pixels of each land use type in the land use map of 2012.

Table 4 Transitional area matrix created from Markov chain analysis
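The relation between the cross-tabulation, the transitional probability matrix, and the transitional area matrix can be sketched as follows. This is a simplified illustration: the adjustment for proportional errors mentioned above is not reproduced, the cross-tabulation values are toy numbers, and the function names are hypothetical. It assumes that the projection step has the same length as the interval between the two input maps.

```python
import numpy as np

def transition_probabilities(crosstab):
    """Probability of moving from each earlier class (row) to each later class (column).

    `crosstab` follows the layout of Table 2: rows = later map, columns = earlier map.
    Every class must be present in the earlier map (no zero column sums).
    """
    from_to = np.asarray(crosstab, dtype=float).T  # now rows = earlier, columns = later
    return from_to / from_to.sum(axis=1, keepdims=True)

def transition_areas(prob, later_counts):
    """Expected number of pixels moving from each class to each other class."""
    return prob * np.asarray(later_counts, dtype=float)[:, None]

# Toy cross-tabulation (pixel counts) for three hypothetical classes.
crosstab = np.array([[3, 0, 0],
                     [2, 2, 0],
                     [1, 0, 1]])
pixels_later_map = crosstab.sum(axis=1)  # pixels per class in the later map

prob = transition_probabilities(crosstab)
areas = transition_areas(prob, pixels_later_map)
print(prob)
print(areas)
```

Each row of `prob` sums to 1, and each row of `areas` sums to the number of pixels that the class occupies in the later map, so the matrix only redistributes existing pixels and carries no information on where the changes occur.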

Evaluating the historical land use changes from 2008 to 2012 based on compact city evidence provided strong input, in the form of transitional rules, to train the CA model. The CA predicted the transitions between land use categories based on the transitional area matrix as well as the probability of growth maps from the WoE model. Figure 5 depicts the projected map for 2016 based on the proposed CA-WoE integration approach. On visual interpretation, the adjacency of similar land use categories controlled the model more strongly than the other evidence. Large amounts of agricultural land were converted to industrial land use in the central and western parts of the city. Moreover, considerable conversion of agricultural to residential land use can be observed in the central and northeastern parts. Therefore, given the strong effect of neighboring cells, this parameter played a crucial role in the tendency of the central cell to transition to another land use type. Other parameters, such as accessibility to main roads and public transportation, also had meaningful effects on the commercial land use type: most of the new commercial areas developed along the main roads. Thus, it can be concluded that land use types with lower coverage are influenced more by other parameters than by the effects of neighboring cells.

Fig. 5 Projected land use map for year 2016 using proposed CA-WoE modeling approach
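The way a CA step can combine neighborhood effects, a growth probability map, and a demand taken from a transitional area matrix may be sketched as below. This is a strongly simplified, hypothetical transition rule for a single target class, not the rule set used in this study: the 3 × 3 Moore neighborhood, the multiplication of the two scores, and the names `neighborhood_share` and `allocate_growth` are all assumptions made for illustration.

```python
import numpy as np

def neighborhood_share(land_use, target):
    """Share of the 8 surrounding cells that belong to `target` (map edges count as non-target)."""
    is_target = (land_use == target).astype(float)
    padded = np.pad(is_target, 1, mode="constant", constant_values=0.0)
    rows, cols = land_use.shape
    total = np.zeros_like(is_target)
    for dr in (0, 1, 2):
        for dc in (0, 1, 2):
            if dr == 1 and dc == 1:
                continue  # skip the central cell itself
            total += padded[dr:dr + rows, dc:dc + cols]
    return total / 8.0

def allocate_growth(land_use, growth_prob, target, demand_cells):
    """Convert the `demand_cells` best-scoring cells to `target`."""
    score = growth_prob * neighborhood_share(land_use, target)
    score[land_use == target] = -1.0  # existing target cells cannot convert again
    best = np.argsort(score, axis=None)[::-1][:demand_cells]
    best = best[score.flat[best] > 0]  # never convert cells without any support
    result = land_use.copy()
    result.flat[best] = target
    return result

# Toy map: 1 = residential (target), 0 = other; uniform hypothetical growth probability.
land_use = np.array([[1, 1, 0, 0],
                     [1, 0, 0, 0],
                     [0, 0, 0, 0],
                     [0, 0, 0, 0]])
growth_prob = np.full(land_use.shape, 0.5)

projected = allocate_growth(land_use, growth_prob, target=1, demand_cells=1)
print(projected)
```

In this toy case the single converted cell is the one enclosed by three residential neighbors, which mirrors the dominant role of adjacency described above.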

Interestingly, the effects of compact city parameters can also be observed on the projected land use map. Areas with a higher degree of city compactness, such as the southern, southeastern, and northwestern parts of the municipality, show growth at higher density (building and residential density) and with more mixed land use. By contrast, the central, western, eastern, and northeastern parts of the municipality, although they show considerable land use changes, still have lower diversity, density, and facility distribution.

Figure 6 illustrates the AUC results of comparing the probability of growth maps for 2016 with the actual map of 2015. The graph shows the similarity between the three growth maps (residential, commercial, and industrial land use) and the actual map of these land use types for 2015. All AUC values exceed 85%, which indicates a good fit of the growth probability maps for all land use types. Commercial land use has the lowest value because its growth is more complex to predict and involves a larger number of variables. In contrast, industrial land use mainly grows near existing industrial buildings or is constructed in open spaces far from residential areas. Hence, this land use type is less complicated to model and predict than the other categories. In the next stage of the validation process, the entire projected map for 2016 was compared with the land use map of 2015. The three Kappa indices of agreement were 0.92, 0.94, and 0.91 for Kappa for no information, Kappa for location, and Kappa standard, respectively. Therefore, the results of both validation stages showed that the WoE performed well, especially in determining the factors that have a significant effect on the changes of each land use type. The similarity results indicated that the proposed CA-WoE model, through the selection and analysis of important parameters as well as city compactness factors, can model land use changes with reliable and acceptable accuracy.

Fig. 6 Area under the curve (AUC) for land use probability of growth maps for year 2016
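The two validation measures can be illustrated with a small sketch. The AUC is computed here as the probability that a cell that actually changed receives a higher growth probability than a cell that did not (ties count as one half), and only the standard Kappa is shown; Kappa for no information and Kappa for location are not reproduced. The toy data and function names are hypothetical, and the pairwise AUC computation is suitable for small samples only.

```python
import numpy as np

def auc_score(growth_prob, changed):
    """AUC of a growth probability map against a binary map of actual change."""
    prob = np.asarray(growth_prob, dtype=float).ravel()
    changed = np.asarray(changed, dtype=bool).ravel()
    pos, neg = prob[changed], prob[~changed]
    # Compare every changed cell with every unchanged cell.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

def kappa_standard(predicted, actual, n_classes):
    """Standard Kappa: agreement between two categorical maps beyond chance."""
    p = np.asarray(predicted).ravel()
    a = np.asarray(actual).ravel()
    table = np.bincount(a * n_classes + p, minlength=n_classes ** 2)
    table = table.reshape(n_classes, n_classes) / p.size
    observed = np.trace(table)
    expected = (table.sum(axis=0) * table.sum(axis=1)).sum()
    return (observed - expected) / (1 - expected)

# Toy data: growth probabilities against actual change, and two small class maps.
auc = auc_score([0.9, 0.8, 0.4, 0.3, 0.2], [1, 1, 0, 1, 0])
kappa = kappa_standard(predicted=[0, 0, 1, 1, 2, 2, 1, 0],
                       actual=[0, 0, 1, 1, 2, 1, 1, 0],
                       n_classes=3)
print(auc, kappa)
```

An AUC of 0.5 corresponds to a random ranking of cells, and a Kappa of 0 corresponds to agreement no better than chance, so values close to 1 for both measures indicate a good fit.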

When normal CA (without factor assessment using the WoE approach) was run on the maps of 2008 and 2012 to create a projected map for 2016, and this map was compared with the actual land use map (2015), the projected map created by CA-WoE was found to be more informative and reliable. Normal CA predicted the future pattern mainly based on simple rules about spatial adjacency effects and local relations between the various land use categories. The results revealed that normal CA does not capture the relationships and interactions between parameters. By contrast, CA-WoE created links between the various social, physical, environmental, and other characteristics of the sites; thus, the model provides more realistic and behavior-oriented transitional rules in the CA environment. In general, because the CA-WoE approach is capable of factor analysis and assessment, various scenarios and ideas can be examined and proposed.

After the validity of the modeling approach was confirmed, the process was applied to the land use maps of 2008 and 2015 to create the future map for 2022. Figure 7 depicts the predicted map for 2022. In this map, consistent with the previous growth trend of the city, significant growth of commercial land use can be observed along the main roads, especially around the train station. New industrial buildings have grown in the vicinity of previous industrial sites in the central-west areas, and the resulting loss of agricultural fields in these areas can be seen clearly. In the northern parts of the city, a gradual growth of industrial parcels can also be observed, which may be due to the influence of existing industrial sites inside or outside the city. The same holds for residential land use, which grows near existing residential areas throughout the city. It can be seen that, because of the growth of urban settlement, the loss of agricultural and natural environments inside and near urban areas cannot be stopped completely. However, by directing growth toward the abandoned lands and brownfields that exist inside the city border (with the aim of brownfield revitalization and redevelopment, which is one of the objectives of the compact city), the loss of these green and productive environments can be reduced and controlled significantly.

Fig. 7 Projected land use map for year 2022 using proposed CA-WoE modeling approach

Conclusion

Unorganized and fast urban growth has caused widespread loss of valuable agricultural and green fields, particularly in developing countries and tropical regions. In this context, projecting the growth and change of urban areas according to sustainable development paradigms such as the compact city is crucial because it provides useful information and vision for local planning authorities. Nevertheless, modeling and projecting changes in urban land use is a complex process because of the difficulties and uncertainties that exist in urban systems. However, these issues can be addressed with multidisciplinary geospatial techniques and systematic approaches.

This paper presents the results of applying a hybrid model that incorporates CA as a cellular-based approach and probabilistic WoE as a factor-based approach to predict future land use changes. The developed model has the following strengths and benefits: (1) The model is based on real trends of change in Kajang City. The evaluation of historical trends showed significant growth of residential, commercial, and industrial areas compared with the other land use types. (2) The model is based on several parameters related to city compactness and urban issues. Analysis of these parameters demonstrated the essential role of the model in understanding the spatial structure of land use changes. (3) The calculation of parameter weights (evidence) is based on statistical and historical analyses instead of the subjective choice of weighting parameters. For this reason, the maps obtained using the WoE showed better and more reliable results than maps created using expert knowledge or simple weighting techniques. (4) The model is dynamic and considers the spatial complexity of the problem. The CA model normally incorporates simple rules regarding spatial neighborhood effects that govern the system dynamics to determine land use changes.

Hence, the integration of the CA model with WoE successfully established the functional relationship between important parameters and development of the various land use categories. In fact, WoE provided a simple and straightforward approach for the selection of effective factors and then statistically utilized them to assess the growth probability for various land use types. It is concluded that the integration of this model with GIS-based CA is a strong approach for modeling land use changes for spatially complex urban areas.

Finally, the results showed strong neighborhood effects and spatial autocorrelation of urban patterns. The majority of the land use types showed a tendency to expand directly adjacent to existing areas of the same type. In addition, some land use types, such as residential and industrial areas, were preferably located at a distance from each other, which led to negative spatial autocorrelation. This important structural spatial dependency provides a basis for formulating valuable guidelines for understanding land use change modeling. Moreover, the final outputs provide a proper perspective on the potential effects of urban changes on rural areas, which can be used as a reference by local planners and resource managers. Specifically, the projected land use maps indicated that the growth of land use types based on the compact city principle could decrease the loss of agricultural fields and result in more sustainable city development.