Introduction

Human beings have altered the face of the earth for centuries, but with the invention of machines, land cover dynamics have changed drastically. Human activity and interaction with the earth’s surface to obtain the necessities of daily life have converted natural land cover into man-made land cover (Hamad et al. 2018). These activities are considered the main factor in changing land use from arable to built-up and in altering the natural environment in both urban and suburban areas (Mallick et al. 2008). Additionally, the extent of urbanization determines the pattern of change in land use land cover (LULC). The problem of LULC conversion and modification can be addressed by modeling, which has recently become a thrust area and has drawn international attention because of concern over issues such as climate change and global warming. Yesserie (2009) analyzed fragmentation and spatiotemporal LULC change patterns in Valencia Municipality by determining the extent and rate of change. Muhlestein (2008) related land cover change to the demographic details of Maverick County and emphasized the use of satellite imagery as a strong tool for analyzing and interpreting earth systems. Reis (2008) investigated LULC changes in Rize, North-East Turkey, with respect to terrain variability (slope and altitude) using GIS functions, and observed that LULC changes mostly occurred in coastal areas and in areas with low slope. Yu et al. (2011) analyzed and modeled long-term LULC change in Daqing City using a system dynamics model; the trend analysis identified three driving forces responsible for the change: population growth, land use management, and socio-economic policies. Halmy et al. (2015) identified LULC changes in the northwestern desert of Egypt and used an integrated Markov-cellular automata (CA) approach to predict future changes.
The Markov model uses the transition probability matrix between images from different dates to predict a future event and the trend of its development (Naboureh et al. 2017). The CA model simulates complex spatial structure through simple local raster transformations and has its own advantages in modeling social human systems (Al-sharif and Pradhan 2014; Gharbia et al. 2016). In general, many other models create a spatial layout of present and past land surfaces using dynamic optimization techniques. Knowledge of LULC has therefore recently become important for many planning and management activities. Effective analysis and monitoring of land cover need a considerable quantity of data concerning the earth’s surface and its living habitats. Further, in studies of this kind, the analysis is aided by mathematical and statistical advances. Several techniques have been used to estimate LULC, and one such technique is MOLUSCE (Modules for Land Use Change Evaluation). MOLUSCE is designed to study temporal LULC changes and simulate future land use, to model potential transitions in land cover and forest cover, and to identify regions vulnerable to deforestation. This plugin incorporates two well-known concepts for analyzing land use dynamics: Markov chain analysis and cellular automata.

In Warangal, there is significant unmonitored LULC change due to several factors, such as illegal occupation, migration of people from rural to urban areas, and climate change. The aim of the study is to predict LULC change from past to future and to validate the prediction against the existing classified image (i.e., 2018). Markov chain analysis was used to detect the change of land use between two periods (2004 to 2006 and 2006 to 2018), and the future LULC for the year 2052 was predicted using cellular automata, which is important for helping urban planners in decision-making.

Methodology

To achieve the goal of the study, which involves the simulation of LULC change scenarios, satellite imagery from Landsat missions for the years 2004, 2006, and 2018 was used. A flowchart describing the basic components of the modeling is presented in Fig. 1. The major steps are (1) database development and classification of images; (2) application of the CA-Markov model to predict LULC for the year 2052; and (3) validation of the results.

Fig. 1 Flow chart showing the methodology followed

Study area

Warangal Municipal Corporation (WMC) is located at 18° N and 79.58° E (Fig. 2), with a population of 746,594 as per the 2011 census. The city has an average elevation of 271 m above MSL. The area is mostly dry, with little variation in temperature, which ranges from 34 to 42 °C. It gets very warm during the summer months of April, May, and June. The rainy season sets in with the onset of the south-west monsoon in the latter part of June and ends in September with the withdrawal of the south-west monsoon. The city lies in the Godavari river basin. It slopes towards the north, and all the major nallahs flow northward. There are a number of water bodies within the study area, and two of the larger ones serve as summer storage tanks for drinking water supply. Of the total municipal area of 80 km², 25% (20.48 km²) is made up of canals, river banks, agricultural land, dense scrub, hills, etc., and the largest portion, 37% (28.84 km²), is occupied by residential area. The economy of Warangal is predominantly agricultural. Cotton has been the major cash crop since the early 1990s. Paddy is the major food crop in the region, and most farmers grow rice for both subsistence and commerce.

Fig. 2 Location map of the study area

Data preparation

To evaluate and model the dynamic LULC change, multi-temporal spatial data were collected. In this study, imagery from the Landsat 5 TM and Landsat 8 satellites with a resolution of 30 m, for the years 2004, 2006, and 2018, was used. Detailed information on the data is given in Table 1. All the spatial data were set to the same coordinate system, WGS_1984_UTM_Zone_40N. Raster images of the study area were subset to the administrative boundary. To evaluate the accuracy of the interpreted satellite imagery, Google Earth and field investigation were used as ground truth. Taking into account the rate of socio-economic advancement and the periodic variation of LULC change, the temporal range for the simulations cannot be short; hence, 2004, 2006, and 2018 were taken as the input time nodes to predict the LULC pattern for the year 2052.

Table 1 Satellite imagery used in study

Classification of satellite images

At this stage, a man-machine interactive interpretation is carried out to extract the remotely sensed information. The maximum likelihood (ML) algorithm, a supervised classification method with great potential in image processing because of its ability to handle varying data and satellite systems (Roy and Roy 2010), is used to classify the downloaded multi-temporal Landsat imagery. A supervised classification groups pixels into classes by comparing the spectral properties of each pixel with those of a set of representative pixels, i.e., the training areas or samples specified by the user from the ground. Prior to classification, a principal component analysis was performed to visually enhance the images and reduce redundancy in them. Polygons were drawn around the features according to the training samples collected, and these areas represent each LULC type. Google Earth imagery, topographic maps, and personal knowledge of the study area helped in gathering the training samples. The raster images were preprocessed in the open source software Quantum GIS (QGIS) (http://www.qgis.org/en/site/forusers/download.html). Four LULC types were identified in every raster image for the years 2004, 2006, and 2018 (Fig. 3): barren, built-up, vegetation, and waterbodies. LULC features were identified and grouped into types following the land use land cover classification system developed by Anderson et al. (1976) for the USGS, shown in Table 2; the system was slightly modified to suit the study area. After classification, a change detection technique was applied to produce a change matrix, limiting the impacts of sensor and climatic differences between the periods (Adade and Oppelt 2019).

Fig. 3 LULC of the years a 2004, b 2006, and c 2018

Table 2 Description of LULC types

Markov chain for MOLUSCE model calibration and implementation for future LULC prediction

The Markov chain is a stochastic process in which, in a sequence of events, any event that is about to happen in the future depends only on the present state, so that the events form a chain (Ghosh et al. 2017). Prediction using a Markov chain is commonly done by analyzing two LULC raster images classified for different dates, i.e., the state of the system at time t + 1 is predicted from the state of the system at time t. Three objects are created from this analysis: (1) the transition probability matrix, which stores the probability that each class will change to every other class; (2) the transition area matrix, which stores the number of pixels expected to change over a specified number of time units; and (3) conditional probability images, which express the likelihood that each pixel will belong to the assigned class in the next time step (Ghosh et al. 2017). The homogeneous Markov chain model for calibration of land use at a specified time can be expressed mathematically as follows:

$$ {L}_{\left(t+1\right)}={P}_{ij}\times {L}_{(t)} $$
and
$$ {P}_{ij}=\left[\begin{array}{ccc}{p}_{11}& {p}_{12}& {p}_{13}\\ {p}_{21}& {p}_{22}& {p}_{23}\\ {p}_{31}& {p}_{32}& {p}_{33}\end{array}\right] $$

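where L(t) represents the land use status at time t, L(t + 1) the land use status at time t + 1, and Pij the transition probability matrix, whose element pij is the probability of transition from class i to class j.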
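As an illustrative sketch only (not the MOLUSCE implementation), the transition probability matrix can be estimated by cross-tabulating two classified rasters and normalizing each row, after which the class shares are projected one step ahead. The flattened toy rasters, the class codes, and the function names `transition_matrix` and `project` are hypothetical. The sketch assumes the row-vector convention, in which each row of the matrix sums to 1 and the projected shares are obtained as L(t) multiplied by P.

```python
def transition_matrix(raster_t1, raster_t2, n_classes):
    """Estimate p_ij, the probability that a pixel of class i at time 1
    is of class j at time 2, by cross-tabulation and row normalization."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for a, b in zip(raster_t1, raster_t2):
        counts[a][b] += 1
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs


def project(shares, probs):
    """Project class shares one time step ahead (row vector times matrix)."""
    n = len(probs)
    return [sum(shares[i] * probs[i][j] for i in range(n)) for j in range(n)]


# Hypothetical flattened rasters: 0 = barren, 1 = built-up, 2 = vegetation
t1 = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
t2 = [0, 0, 1, 1, 1, 1, 2, 2, 0, 1]

P = transition_matrix(t1, t2, 3)
shares_t2 = [t2.count(k) / len(t2) for k in range(3)]
shares_t3 = project(shares_t2, P)

for row in P:
    print(row)
print(shares_t3)
```

Because every row of P sums to 1, the projected shares also sum to 1, so the total area is conserved from one step to the next.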

This study uses a scenario-bound approach to simulate future LULC changes following model calibration. The LULC maps for the historical periods 2004 to 2006 and 2006 to 2018 were fed into the model to produce the transition matrices for the 14-year period. The LULC images for 2004 and 2006 were used for calibration and optimization of the Markov chain algorithm, with the earliest year, 2004, as time 1 and the later year, 2006, as time 2. The changes between time 1 and time 2 were used to model the LULC map at time 3, the year 2018. The socio-economic evolution during 2004–2018 led to a substantial improvement in the urban environment, which is considered as a required scenario. The classified raster of 2018 was taken as the reference map for the simulated 2018 map, and validation was carried out by running 20 iterations.

Results and discussions

Correlation evaluation

Pearson’s correlation coefficient is a statistical measure of the strength of a linear relationship between paired data. A value of 0 signifies no linear correlation; the closer the value is to 1 or −1, the stronger the linear correlation. Table 3 shows the correlation coefficients for the time spans considered. The results show that the 2006 to simulated 2018 time span has the highest correlation, followed by the 2004 to 2006 time span.
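For reference, Pearson’s coefficient can be computed as the covariance of the paired values divided by the product of their standard deviations. The sketch below is a minimal illustration; the text does not specify which paired values entered Table 3, so the per-class figures used here are hypothetical, as is the function name `pearson_r`.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired per-class values for two dates (not from Table 3)
date_1 = [1.0, 2.0, 3.0, 4.0]
date_2 = [2.0, 4.0, 6.0, 8.0]   # perfectly linear in date_1
date_3 = [8.0, 6.0, 4.0, 2.0]   # perfectly inverse

r_positive = pearson_r(date_1, date_2)
r_negative = pearson_r(date_1, date_3)
print(r_positive, r_negative)
```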

Table 3 Pearson’s correlation

LULC change characteristics of the study area

Socio-economic factors and traffic may play a vital role in the expansion, especially the access of different land types to roads and administrative centers. The LULC maps for 2004, 2006, and 2018 were created using the Semi-Automatic Classification Plugin in QGIS. The rate of temporal LULC change is given in Table 4 (section a); from 2004 to 2006, the percentage change in LULC classes is 5.794 for barren, 2.154 for built-up, −8.024 for vegetation, and 0.0749 for waterbodies, which indicates an increase in built-up land and a decrease in vegetation. Table 4 (section b) gives the statistics for the period 2006 to 2018; the percentage change in LULC classes is −11.908 for barren, 6.636 for built-up, 4.227 for vegetation, and 1.044 for waterbodies. These statistics indicate that, compared with the earlier period, the rate of LULC change accelerated: barren land decreased at a larger rate and built-up land increased, signifying rapid urbanization. This change has affected many socio-economic activities related to the land use pattern. Likewise, Table 4 (section c) gives the estimated rate of change in LULC classes for the period 2006 to simulated 2018. Built-up expansion is primarily along two directions: (1) to the southeast of Warangal, i.e., towards Khilla Warangal, and (2) to the northeast towards Gopalpur, alongside the Kakatiya canal. In addition, there is potential expansion of agricultural land in the northeastern and southern regions of Warangal, along the highways and national roads.
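The reported changes sum to approximately zero within each period, which is consistent with their being changes in each class’s percentage share of the total area; this reading is an assumption, since the text does not define the quantity. Under that assumption, the computation can be sketched as follows. The pixel counts and the function name `share_change` are hypothetical; only the two lists of reported values come from the text.

```python
def share_change(counts_start, counts_end):
    """Change in each class's percentage share of the total area
    (in percentage points) between two classified maps."""
    total_start = sum(counts_start.values())
    total_end = sum(counts_end.values())
    return {
        name: 100 * counts_end[name] / total_end
        - 100 * counts_start[name] / total_start
        for name in counts_start
    }


# Hypothetical pixel counts per class (not the study's data)
start = {"barren": 400, "built-up": 200, "vegetation": 300, "waterbodies": 100}
end = {"barren": 450, "built-up": 230, "vegetation": 220, "waterbodies": 100}
change = share_change(start, end)
print(change)

# Values reported in the text: barren, built-up, vegetation, waterbodies
reported_2004_2006 = [5.794, 2.154, -8.024, 0.0749]
reported_2006_2018 = [-11.908, 6.636, 4.227, 1.044]
print(sum(reported_2004_2006), sum(reported_2006_2018))  # both close to zero
```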

Table 4 (a) LULC change statistics from 2004 to 2006. (b) LULC change statistics from 2006 to 2018. (c) LULC change statistics from 2006 to simulated 2018

Transitional potential modeling in the LULC change using artificial neural network

While many methods are available for computing the transition potential map, artificial neural network (ANN) and logistic regression (LR) are among the computational intelligence methods available in this module. Every method uses LULC information as input for calibrating and modeling LULC change. The use of ANN is justified for problems in which the algorithm deals with a large amount of input data whose relationships are unknown or difficult to formulate. A continuous index is generated that characterizes the terrain in a range from 0 to 1. ANN implements the requirements of fuzzy logic, which is why a continuous range from 0 to 1 is determined based on the usability of the terrain. The core element of ANN is the interaction between connected neurons and the modification of the connection weights between them. This modification depends on the input data and on the expected output from the network. This process is called “neural network learning” (Fig. 4). The result of this learning is a transition probability matrix that shows the direction of transfer between land use types (see Tables 5, 6, and 7). From 2004 to 2006, vegetation, waterbodies, and barren land are the consistent classes, with probabilities of 0.78, 0.76, and 0.76, respectively. The most dynamic class is the built-up class, with a transition probability of 0.94. From 2006 to 2018, the transition of the various land use types is consistent with the previous period; the most stable class is vegetation turning into the built-up class, with a probability of 0.84. The most dynamic classes are barren and vegetation, with transition probabilities of 0.443 and 0.47, which were primarily transformed into waterbodies and built-up, with an increase in probability of 0.60. When 2006 and simulated 2018 are considered, the transitions were almost the same as in the previous comparison.

Fig. 4 Neural network learning curve

Table 5 Transition probability matrix of LULC classes from 2004 to 2006
Table 6 Transition probability matrix of LULC classes from 2006 to 2018
Table 7 Transition probability matrix of LULC classes from 2006 to simulated 2018

Predicting LULC change and analysis of simulation results using cellular automata

The results of ANN processing are passed to cellular automata, and three types of output maps are produced. The transition potential map shows the probability or potential of change from one land use/cover class to another. The values of the transition potential range from 0 (low transition potential) to 100 (high transition potential). Transition potential maps are produced for the corresponding LULC class pairs (e.g., “vegetation to built-up” or “barren to waterbodies” transition potential). Another type of output is the certainty raster, which is the difference between the two largest potentials: the bigger the difference, the greater the confidence that the predicted class will occur in the predicted raster. The transition certainty of the land use types for the periods 2004 to 2018 and 2004 to 2052 is shown in Fig. 5a, b, respectively. To predict likely future trends in the study area, a cellular automata–based approach using the 2004 and 2006 maps was applied to forecast the 2018 raster, shown in Fig. 6a, with a step size of 2 years and 6 iterations. The cellular automata approach is based on a Monte Carlo algorithm. The simulated 2018 raster is validated against the real 2018 classified LULC raster to obtain kappa statistics. Once agreement is obtained from the validation, LULC for the year 2052 is predicted following the same procedure with an increased number of iterations. Figure 6b shows the predicted LULC raster for 2052. These changes arise because the method is based on previous pixel changes up to 2018. Figure 7 shows the change in each land use for the year 2052; barren land is predicted to increase further and cover about 46.6% of the study area in 2052, while the other land uses will change to different degrees.
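The certainty value described above can be illustrated per pixel as the gap between the largest and second-largest transition potentials. The sketch below is a minimal illustration on the 0 to 100 scale used in the text; the pixel values and the function name `certainty` are hypothetical, and the code is not taken from MOLUSCE.

```python
def certainty(potentials):
    """Difference between the two largest transition potentials of a pixel."""
    ordered = sorted(potentials, reverse=True)
    return ordered[0] - ordered[1]


# Hypothetical transition potentials (0 to 100) for two pixels
pixel_a = [90, 10, 5]   # one transition clearly dominates
pixel_b = [48, 45, 7]   # two transitions are almost equally likely

certainty_a = certainty(pixel_a)
certainty_b = certainty(pixel_b)
print(certainty_a, certainty_b)
```

A pixel such as `pixel_a` is assigned its predicted class with high confidence, whereas for `pixel_b` the prediction is close to a tie between two transitions.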

Fig. 5 Certainty raster (difference between the two largest potentials) for the years 2018 and 2052

Fig. 6 a Simulated LULC of 2018. b Predicted LULC of the year 2052

Fig. 7 Representation of future LULC in 2052

Model validation

For a specific project, a predicted LULC is considered reliable only when it is validated against existing datasets. Hence, using the validation module in MOLUSCE, two LULC rasters were compared: the first is the real LULC of the year 2018, and the second is the 2018 raster predicted (simulated) using cellular automata. The same procedure was implemented in predicting the 2052 LULC map, but, in this case, the iterations were run consecutively with respect to the 2004 and 2006 temporal difference. The validation module calculates four parameters: overall kappa, kappa histogram, kappa location, and percentage of correctness, which are used to evaluate the accuracy of the model. The kappa statistics are defined as follows, beginning with the overall kappa K:

$$ K=\frac{P(A)-P(E)}{1-P(E)} $$

  1. Kappa location

    $$ {K}_{\mathrm{loc}}=\frac{P(A)-P(E)}{P_{\mathrm{max}}-P(E)} $$

  2. Kappa histogram

    $$ {K}_h=\frac{P_{\mathrm{max}}-P(E)}{1-P(E)} $$

where \( P(A)=\sum \limits_{i=1}^c{p}_{ii} \), \( P(E)=\sum \limits_{i=1}^c{p}_{iT}{p}_{Ti} \), and \( {P}_{\mathrm{max}}=\sum \limits_{i=1}^c\min \left({p}_{iT},{p}_{Ti}\right) \); pij is the i,jth cell of the contingency table, piT is the sum of all cells in the ith row, pTj is the sum of all cells in the jth column, and c is the number of raster categories.
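The three kappa statistics and the percentage of correctness can be computed directly from a contingency table of the two rasters. The following is a minimal sketch of the formulas above, not the MOLUSCE validation code; the table of pixel counts and the function name `kappa_statistics` are hypothetical.

```python
def kappa_statistics(table):
    """Overall kappa, kappa location, kappa histogram, and percentage of
    correctness from a square contingency table of pixel counts."""
    c = len(table)
    total = sum(sum(row) for row in table)
    p = [[cell / total for cell in row] for row in table]
    row_sums = [sum(p[i]) for i in range(c)]                        # p_iT
    col_sums = [sum(p[i][j] for i in range(c)) for j in range(c)]   # p_Tj
    p_a = sum(p[i][i] for i in range(c))                            # P(A)
    p_e = sum(row_sums[i] * col_sums[i] for i in range(c))          # P(E)
    p_max = sum(min(row_sums[i], col_sums[i]) for i in range(c))    # P_max
    return {
        "overall": (p_a - p_e) / (1 - p_e),
        "location": (p_a - p_e) / (p_max - p_e),
        "histogram": (p_max - p_e) / (1 - p_e),
        "correctness": 100 * p_a,
    }


# Hypothetical contingency table for three classes (pixel counts)
table = [
    [45, 5, 0],
    [10, 20, 0],
    [5, 5, 10],
]
stats = kappa_statistics(table)
print(stats)
```

It follows from the definitions that the overall kappa equals the product of kappa location and kappa histogram, so the overall value can never exceed either component when both lie between 0 and 1.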

When the real classified raster is compared with the predicted LULC raster, kappa statistics give the amount of agreement between the rasters, corrected for the agreement expected by chance. Overall kappa is used to evaluate the overall success of the simulation, and kappa location is used to evaluate the ability of the simulation to identify location. A kappa of 0% indicates no agreement, while 100% indicates perfect agreement. Table 8 shows the result of the validation module for the agreement between the simulated 2018 raster and the real classified raster of 2018. Kappa (histogram) is 0.933, kappa (overall) is 0.653, and kappa (location) is 0.699, while the percentage of correctness is 76.229, which shows that the consistency between the predicted and real 2018 LULC is good and the model is reliable for Warangal. Therefore, the 2052 LULC raster was predicted with a step size of 2 years and 23 iterations in the cellular automata module. The validation statistics of the predicted 2052 raster are given in Table 9: kappa (histogram) is 0.911, kappa (overall) is 0.630, and kappa (location) is 0.692, while the percentage of correctness is 74.690.
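Two simple arithmetic checks can be made on the figures reported above. The first assumes that each cellular automata iteration advances the map by one 2-year step counted from the 2006 map; this is an interpretation of the text, not a documented property of the plugin. The second uses the identity that overall kappa is the product of kappa location and kappa histogram, which follows algebraically from the three formulas. The function names are hypothetical.

```python
def target_year(base_year, step_years, iterations):
    """Year reached after a number of iterations of fixed step size."""
    return base_year + step_years * iterations


def kappa_from_components(k_location, k_histogram):
    """Overall kappa implied by K = K_loc * K_h."""
    return k_location * k_histogram


# Iteration counts reported in the text, assuming the 2006 map as the start
year_validation = target_year(2006, 2, 6)    # 2018
year_prediction = target_year(2006, 2, 23)   # 2052

# Reported kappa (location) and kappa (histogram) for 2018 and 2052
implied_2018 = kappa_from_components(0.699, 0.933)   # about 0.652
implied_2052 = kappa_from_components(0.692, 0.911)   # about 0.630

print(year_validation, year_prediction)
print(round(implied_2018, 3), round(implied_2052, 3))
```

Both implied values agree with the reported overall kappa (0.653 and 0.630) to within rounding, so the reported statistics are internally consistent.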

Table 8 Kappa index of the model validation (simulated 2018 LULC with reference 2018)
Table 9 Kappa index of the model validation (simulated LULC of 2052)

Conclusions

This research aims to detect and evaluate LULC change using QGIS, in particular the MOLUSCE plugin, and to predict likely future change using cellular automata-Markov chain–based simulation. The results indicate the capability of open source GIS (Quantum GIS) to run spatial analysis for spatiotemporal land use change studies. Hence, it has great potential for environmental planning and adaptive management. Specifically, the application is useful for land use planners and decision-makers to monitor and assess long-term development. To meet the research objective, the data were prepared in the QGIS environment. The images of 2004, 2006, and 2018 were classified into four LULC classes: barren, built-up, vegetation, and waterbodies. These images were used to project the likely changes in 2052. The outcomes of classification indicate a significant growth of some land cover types, mainly barren and vegetation, during the period 2006 to 2018, with a steady decrease of other land cover types by the year 2052, especially waterbodies. The result of the LULC change projection, which was carried out using CA, suggests that residential areas and public buildings will continue to increase in Warangal, and this will affect other land cover types such as agriculture and waterbodies. Population growth will increase due to the increase in residential area and the concentration of ministries and institutions in Warangal Municipal Corporation, as it is on the list of smart cities, so people prefer to live near their place of work. In summary, the methods used here proved effective in answering the research questions and achieving the aims of the study. QGIS provided tools that play a powerful role in studies of this type and are very helpful for planners in addressing the problems that arise from environmental change and human activities. This study offers some recommendations to enable them to use land resources in a better way; these are as follows.

  i. Planners should develop urban planning for the study area by providing short- and long-term plans, since uncontrolled urban growth has expanded and is distributed over the study area; this will affect land cover types.

  ii. The government should take responsibility for providing a master plan for the study area. This is required because the share of agricultural land has decreased. Green space should be increased rather than be allowed to transform into built-up area.