1 Introduction

Over the last decade, many regions of the world have urbanized rapidly, and the urban share of the world population has risen from 5% in 1900 to more than 54% today. The level of urbanization varies substantially across countries. A United Nations report estimates that urban development and overall population growth will add 2.5 billion people to the world's urban population by 2050, bringing the urban share to 66%, with nearly 90% of the increase concentrated in Asia and Africa [1].

Urbanization is a complex process driven by a variety of spatial–temporal factors and is usually hard to quantify and interpret [2]. Physical and morphological conditions, economic state, population growth, political situation, policies, and social behavior vary across regions, which makes the pattern of urban growth difficult to predict [3]. However, planners need accurate and up-to-date information about the state of urbanization, the rate of urban expansion, and the patterns and extent of sprawl in order to provide the services required by the urban population and to manage complex urban development effectively [4]. Land use/land cover (LULC) change has both direct and indirect impacts on the environment and on regional and global sustainable development, because the continuous evolution and transformation of the land surface may alter a number of environmental processes, such as soil erosion, surface run-off, pollution and carbon storage [5,6,7].

The simulation and prediction of urbanization can provide input to various environmental and planning models [8]. Over the last three decades, many types of simulation and prediction models, including system dynamics models, GeoMod, SLEUTH, multi-agent models and Markov models, have been used within a GIS environment to project future urban growth and LULC change. However, none of these models is perfect. GeoMod and SLEUTH are unsatisfactory in terms of efficiency, and they depend on auxiliary software. Markov models can quantitatively predict dynamic changes in landscape patterns; however, they cannot resolve the spatial patterns of landscape change [9]. In contrast, cellular automata (CA) models can predict the spatial distribution of landscape patterns but cannot predict temporal changes [10]. For these reasons, researchers integrate different methods to characterize the dynamics of land use/cover, urban sprawl and so forth [11]. For example, the IDRISI software developed by Clark Labs at Clark University is one of the best platforms for CA–Markov modelling: it integrates a cellular automaton filter with Markov processes and uses conversion tables and the conditional probabilities of the conversion map to predict the states of land-use change, which may make it better suited to land-use change simulation [12]. Coupling CA with a Markov chain model provides a robust approach to modelling the spatial and temporal dynamics of LULC change (LULCC), because remote sensing (RS) and GIS data can be efficiently incorporated [13, 14] and provide more detailed information on a synoptic scale.

Temporal LULC and urban sprawl analyses with spatial metrics are useful for quantifying the spatial patterns of landscape dynamics [15] and for understanding urban phenomena through spatial metrics such as the Shannon Evenness Index (SHEI), the Number of Patches (NP) and the Normalized Landscape Shape Index (NLSI), and through modelling tools such as the Multi-Layer Perceptron (MLP) [16], which provide valuable insights into the inherent spatial structures and growth patterns over time [17].

El Jadida is the second largest metropolis in the Casablanca-Settat region after Casablanca. It is well known as an international tourist city (Old Portuguese Medina) with a pleasant natural landscape (forest and beaches). Moreover, it has experienced rapid urbanization since 2000 because of its favorable location, which has brought rapid growth in population and economy. Given its advantageous location, rapid urbanization, abundant natural resources, and international fame, El Jadida is an interesting and important case study. Accordingly, the objectives of this study are to (1) monitor the changes over the past 19 years, (2) simulate and predict future land use change for the years 2018 and 2040 based on the CA–Markov model and remote sensing data, and (3) perform a fragmentation analysis, so as to enable development practitioners, planners, resource managers and policy makers to manage the city effectively and tailor interventions for its sustainable development.

2 Materials and methods

2.1 Study area

El Jadida city, located on the Moroccan Atlantic coast, covers 58 km2 of land between latitudes 33.14° and 33.18°N and longitudes 8.27° and 8.34°W, at an average elevation of 22 m above mean sea level (Fig. 1). According to the 2014 census, the population of El Jadida increased from 66,296 in 2004 to 78,616 in 2014, with an urbanization rate of 49.7%. The overall population density increased from 5430 inhabitants per km2 in 2004 to 6620 inhabitants per km2 in 2014, and the local per capita GDP was USD 2505 in 2017, which is above the national average (USD 2402) (High Commission for Planning, 2014). The climate is semi-arid, with an annual rainfall of 366 mm, a mean temperature of 18 °C and high atmospheric humidity. The study area has gentle relief, with an average slope of 4°, and is part of the geological unit known as the Moroccan coastal Meseta (Sahel Doukkala). It contains sub-tabular sedimentary series from the Mesozoic and Cenozoic eras that rest on Paleozoic terrain folded during the Hercynian orogeny [18]. El Jadida is considered the second future industrial pole of Morocco and is well known for its picturesque natural scenery (forest and beaches) and highly developed socio-economic activities (industry, tourism, agriculture and fisheries), particularly those related to phosphate and the Jorf Lasfar harbor, which contributes 23% of national industrial production and 32% of Morocco’s total exports [19].

Fig. 1 Map showing the study area

2.2 Data and preprocessing

The methodological framework used in this study is shown in Fig. 2. It comprises data processing, development of the CA–Markov model, simulation and forecasting of urban sprawl, verification of simulation accuracy, and analysis of land consumption patterns in the El Jadida urban agglomeration in 1999, 2006, 2010 and 2018 within a GIS environment.

Fig. 2 Hierarchical structure of the urban LCM model

In the present study, time-series Landsat images downloaded from the USGS (https://earthexplorer.usgs.gov/) were used to create the LULC maps for monitoring and modelling built-up growth and land transformation. Four images were selected to cover the 19-year span and to detect the temporal dynamics of the urban area. All images were acquired in the dry season (summer and autumn) with minimal cloud cover. These images have 30 m resolution multispectral bands and a 15 m resolution panchromatic band. The Landsat-7 ETM+ images of 2006 and 2010 were used for model calibration, and the Landsat-8 OLI image of 2018 was used for model validation. All images were subjected to geometric correction, image enhancement and stripe processing. The 30 m multispectral bands were fused with the 15 m panchromatic band using the Gram–Schmidt fusion method, which improves the spatial resolution of the multispectral bands while retaining the spectral information of the source imagery [20]. Auxiliary and explanatory data obtained from the El Jadida urban agency included a 5 m resolution digital elevation model (DEM), slope, the locations of the main public and commercial facilities, railway stations, main roads, census data (such as administrative boundaries and core area) and residential objects updated in 2018. These data were selected for their potential effect in promoting urbanization during the modelling phase (Table S1), and the locations were used to produce distance maps with the “Euclidean Distance” analysis in the ArcGIS® software package, version 10.3. All input data for the LCM model were structured to share the same processing extent (the limit of the study area), the same coordinate system and a cell size of 30 m (the same as the classification obtained from the Landsat images). The same number of land cover classes was used in every map, the roads layer was classified as binary, and the driving factors were normalized to the range 0–255. Excluded areas were expressed as a Boolean map.
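The distance-map and normalization steps can be illustrated with a minimal sketch. It is not the ArcGIS® implementation: the function names, the brute-force search and the tiny 3 × 3 roads grid are assumptions made for illustration only, and the linear stretch to 0–255 is one plausible reading of the normalization described above.

```python
import math

def euclidean_distance_map(sources, cell_size=30.0):
    """Distance in metres from each cell to the nearest source cell.

    `sources` is a 2D list of 0/1 flags (1 = feature cell, e.g. a road).
    Brute force for clarity; suitable for small illustrative grids only.
    """
    points = [(r, c) for r, row in enumerate(sources)
              for c, flag in enumerate(row) if flag]
    if not points:
        raise ValueError("at least one source cell is required")
    return [[min(math.hypot(r - pr, c - pc) for pr, pc in points) * cell_size
             for c in range(len(row))]
            for r, row in enumerate(sources)]

def rescale_0_255(grid):
    """Linearly stretch a raster to the integer range 0..255."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0 for _ in row] for row in grid]
    return [[round((v - lo) / (hi - lo) * 255) for v in row] for row in grid]

# Hypothetical binary roads layer: a single road cell in the top-left corner.
roads = [[1, 0, 0],
         [0, 0, 0],
         [0, 0, 0]]
dist = euclidean_distance_map(roads)   # metres, 30 m cells
scaled = rescale_0_255(dist)           # driving factor stretched to 0..255
```

In practice a raster library with an exact distance transform would replace the brute-force loop; the sketch only makes the two operations explicit.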

2.3 Land cover classification and gradient analysis with spatial metrics

In order to observe and quantify urban growth, the maximum likelihood supervised classification method was used to classify the Landsat images because of its simplicity and robustness. Three land-cover categories were classified, namely built-up, vegetation and bare soil (bare land) (Table S2). A total of 165 (for 1999), 146 (for 2006), 171 (for 2010) and 151 (for 2018) training samples were collected for the maximum likelihood supervised classification. To avoid any major misclassification, the accuracy of the classifications was assessed by comparing a set of sample points from the classified land-cover maps with reference data, based on selective GPS field checks in 2018 and on historical Google Earth® images for 1999, 2006 and 2010. The overall classification accuracy and the Kappa coefficient of the four LULC maps were then determined [20, 21]. The overall classification accuracy was 84%, 89%, 91% and 94%, and the Kappa coefficient was 0.79, 0.84, 0.91 and 0.93 for 1999, 2006, 2010 and 2018, respectively, which indicates that the classification method was effective [22].
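As an illustration of how overall accuracy and the Kappa coefficient are derived from a confusion matrix, the following minimal sketch uses a hypothetical 3 × 3 matrix (rows are classified labels, columns are reference labels). The counts are invented for the example and are not the study's data.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's Kappa from a square confusion matrix."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(k)) / total
    row_tot = [sum(matrix[i]) for i in range(k)]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the row and column totals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical counts for built-up, vegetation and bare soil sample points.
confusion = [[45, 3, 2],
             [4, 40, 6],
             [1, 7, 42]]
overall, kappa = accuracy_and_kappa(confusion)
```

Because Kappa discounts the agreement expected by chance, it is always lower than the overall accuracy unless the classification is perfect.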

Urban gradient analysis with spatial metrics is helpful for quantifying the spatial characteristics of the landscape and for identifying the causal factors and the locations experiencing various forms of urbanization (sprawl, compact growth, etc.) in response to economic, social and political forces. The selected spatial metrics listed in Table 1 (with the characteristics of each metric) were used to analyse and understand the urban dynamics at different levels: patch, class and landscape. The Number of Patches (NP), the Normalized Landscape Shape Index (NLSI) and the Shannon Evenness Index (SHEI) were used in the analysis. These metrics were calculated with the FRAGSTATS 4.2 software package, using the eight-cell neighbor rule (all eight adjacent cells, i.e. the four orthogonal and four diagonal neighbors) to define patch neighbors [23]. FRAGSTATS metrics were designed for ecosystem and landscape studies [24] and have been widely used in different studies [25, 26]. In this study, a multiple ring buffer with a distance interval of 1 km from the city center of El Jadida (the old Portuguese city) was prepared to derive the zonal urban expansion along all directions, viz. North (N), East (E), South (S), West (W), North-East (NE), South-East (SE), South-West (SW) and North-West (NW).
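The ring-and-direction zoning can be sketched as follows. The sketch assumes projected coordinates in metres with y increasing northwards and eight 45° sectors centred on the compass directions; the exact sector boundaries used in the study are not stated, so this partition and the function name are illustrative assumptions.

```python
import math

DIRECTIONS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def zone_of(x, y, cx, cy, ring_width=1000.0):
    """Return (ring index, direction) of point (x, y) around centre (cx, cy).

    Ring 1 covers 0 to 1 km from the centre, ring 2 covers 1 to 2 km, etc.
    """
    dx, dy = x - cx, y - cy
    ring = int(math.hypot(dx, dy) // ring_width) + 1
    # Bearing measured clockwise from north, in degrees [0, 360).
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    # Shift by half a sector so that each sector is centred on its direction.
    sector = int(((bearing + 22.5) % 360.0) // 45.0)
    return ring, DIRECTIONS[sector]

# A cell 2.5 km east and 2.5 km south of a hypothetical city centre at (0, 0).
example_zone = zone_of(2500.0, -2500.0, 0.0, 0.0)
```

Each raster cell can be assigned to a zone in this way, after which the metrics are computed per zone and per year.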

Table 1 Spatial metrics
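To make two of these metrics concrete, the sketch below counts built-up patches with the eight-cell neighbor rule (NP) and computes SHEI as the Shannon diversity divided by the natural logarithm of the number of classes present. It follows the general FRAGSTATS definitions as we understand them, not the FRAGSTATS code itself; the class codes and the 4 × 4 grid are hypothetical.

```python
import math

def number_of_patches(grid, target):
    """Count patches of `target` using 8-cell connectivity (flood fill)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != target or seen[r][c]:
                continue
            patches += 1
            seen[r][c] = True
            stack = [(r, c)]
            while stack:
                cr, cc = stack.pop()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and not seen[nr][nc]
                                and grid[nr][nc] == target):
                            seen[nr][nc] = True
                            stack.append((nr, nc))
    return patches

def shannon_evenness(grid):
    """SHEI = -sum(p_i * ln p_i) / ln(m), with m classes present."""
    counts = {}
    for row in grid:
        for value in row:
            counts[value] = counts.get(value, 0) + 1
    m = len(counts)
    if m < 2:
        return 0.0
    total = sum(counts.values())
    shdi = -sum((n / total) * math.log(n / total) for n in counts.values())
    return shdi / math.log(m)

# Hypothetical zone: 1 = built-up, 2 = vegetation, 3 = bare soil.
zone = [[1, 2, 2, 1],
        [2, 1, 2, 2],
        [3, 3, 2, 1],
        [3, 3, 1, 1]]
np_built = number_of_patches(zone, 1)
shei = shannon_evenness(zone)
```

With the eight-cell rule, the diagonally touching built-up cells in the upper-left corner form a single patch; under a four-cell rule they would count as two.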

2.4 Urban growth forecasting and accuracy assessment

2.4.1 CA–Markov chain model

The CA–Markov model combines CA with a Markov chain: it adds an element of spatial contiguity and knowledge of the likely spatial distribution of transitions to Markov chain analysis, and it can simulate changes and predict decadal variations from satellite images [27, 28]. The Markov model focuses on quantitatively predicting land-use change between an earlier (t1) and a later (t2) date by developing a transition probability matrix between them, but it is poor at handling the spatial patterns of land-use change. CA is a cell-based approach built on the concept of proximity and can predict transitions among any number of categories: cells that lie closer to existing areas of a given class are more likely to change to that class, conditioned by the Markov transition rule and the adjacent neighbors. The IDRISI software developed by Clark Labs at Clark University is one of the best platforms for CA–Markov modelling, and it was applied in this study. The CA–Markov model in IDRISI integrates a CA filter with Markov processes and uses conversion tables and the conditional probabilities of the conversion map to predict the states of land-use change, which may make it better suited to land-use change simulation [29]. CA–Markov modelling in IDRISI involves two techniques: Markov chain analysis and CA [30]. The transition probability matrix gives the likelihood that a cell or pixel will move from one land use category or class to every other category [31]. In this study, the land cover maps of 2006 and 2010 were used to calculate the transition probability matrix using a Multi-Layer Perceptron (MLP) for the transitions from vegetation to built-up and from bare soil to built-up. The MLP constructs a network of neurons between the two example classes and the driving factors, together with a web of connections that consists of sets of weights. The sample cells are then divided into two groups: the first 50% are used for training and the second 50% for validation.
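The idea of a Markov transition probability matrix can be sketched by cross-tabulating two classified maps. This shows only the plain Markov cross-tabulation, not the MLP transition-potential step or the IDRISI implementation; the class codes (V = vegetation, S = bare soil, B = built-up) and the 3 × 3 maps are hypothetical.

```python
def transition_matrix(map_t1, map_t2, classes):
    """Row-normalized transition probabilities between two co-registered maps.

    result[a][b] is the share of cells in class a at t1 that are in class b
    at t2. Rows for classes absent at t1 are left at 0.0.
    """
    counts = {a: {b: 0 for b in classes} for a in classes}
    for row_t1, row_t2 in zip(map_t1, map_t2):
        for a, b in zip(row_t1, row_t2):
            counts[a][b] += 1
    probs = {}
    for a in classes:
        total = sum(counts[a].values())
        probs[a] = {b: (counts[a][b] / total if total else 0.0)
                    for b in classes}
    return probs

# Hypothetical earlier (t1) and later (t2) land cover maps.
lulc_t1 = [["V", "V", "B"],
           ["V", "S", "S"],
           ["S", "S", "B"]]
lulc_t2 = [["V", "B", "B"],
           ["B", "S", "B"],
           ["S", "S", "B"]]
P = transition_matrix(lulc_t1, lulc_t2, ["V", "S", "B"])
```

In this toy example two of the three vegetation cells become built-up, so the vegetation-to-built-up probability is 2/3, while built-up cells persist with probability 1.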

2.4.2 Potential driving factors

Urban expansion is generally driven by the search for better infrastructure, and the forces and drivers of urban expansion differ from region to region. Detecting and locating the driving factors that may be related to LULC change is therefore a crucial step in modelling urban growth. In this study, the explanatory variables included in the transition probability matrix to simulate the land use maps for 2018 and 2040 are elevation, slope, distance to public and commercial facilities, distance to urban areas (residential objects), and distance to roads and railway. These locations were used to produce distance maps with the “Euclidean Distance” analysis in the ArcGIS® software package, version 10.3 (Fig. 3). A constraints layer was also introduced to prevent some areas, such as public parks, from becoming urban. To test the potential power of the explanatory variables, the Land Change Modeler (LCM) provides Cramer’s V, which measures the association between each variable and the distribution of land use types. After the driving-factor layers were transformed to their natural logarithm, the driver variables were tested and selected on the basis of Cramer’s V. In general, variables with a Cramer’s V of about 0.15 or higher are useful, while those with values of 0.4 or higher are good [32].
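For reference, Cramer's V can be computed from a contingency table with the standard chi-square formula, as in the sketch below. The table is hypothetical (rows are binned distance-to-road classes, columns are "changed to built-up" and "unchanged") and the binning is an assumption; LCM's internal procedure may differ in detail.

```python
import math

def cramers_v(table):
    """Cramer's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    k = min(len(row_tot), len(col_tot)) - 1
    if n == 0 or k < 1:
        raise ValueError("need a non-empty table with at least 2 rows and 2 columns")
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Hypothetical counts: near, intermediate and far from roads.
table = [[30, 10],
         [20, 20],
         [10, 30]]
v = cramers_v(table)
```

For this invented table V is about 0.41, which would fall in the "good" range by the thresholds quoted above.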

Fig. 3 Urban growth contributing factors: a Elevation, b Slope, c Distance from roads, d Distance from urban areas, e Distance from public and commercial facilities, f Excluded areas (constraint areas)

2.4.3 Calibration and validation

Calibration and validation are two critical processes for testing the effectiveness of the CA–Markov model. A clear distinction between calibration and validation is needed to make the modelling results credible [33, 34]. The predictive power of the model is quantified by comparing the simulated 2018 map with the 2018 reference map using variations of Kappa [35, 36]: Kappa for location (Klocation) and Kappa for quantity (Kquantity). During calibration, the 2006 and 2010 land use maps were used to calculate the transition probability matrix. In addition, the 2010 land use map and the six potential driving factors were integrated into the transition probability matrix to simulate the 2018 land use map. For validation, the simulated 2018 map was cross-tabulated with the classified 2018 map. The predictive power of a model is considered strong when its efficiency is greater than or equal to 80%; the model can then be used to make future projections (2040), assuming that the transition mechanism verified between 2010 and 2018 will be repeated.
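The cross-tabulation used for validation can be sketched as follows. The sketch computes the standard Kappa and one common formulation of Klocation, (Po − Pe) / (Pmax − Pe), where Pmax is the best agreement achievable given the class quantities of the two maps. Whether this is exactly the variant reported by the software is an assumption, and the two 3 × 3 maps are hypothetical.

```python
def kappa_and_klocation(simulated, reference, classes):
    """Standard Kappa and Klocation from two co-registered categorical maps."""
    n = 0
    agree = 0
    sim_count = {c: 0 for c in classes}
    ref_count = {c: 0 for c in classes}
    for sim_row, ref_row in zip(simulated, reference):
        for s, r in zip(sim_row, ref_row):
            n += 1
            sim_count[s] += 1
            ref_count[r] += 1
            if s == r:
                agree += 1
    p_obs = agree / n
    # Chance agreement given the class proportions of both maps.
    p_exp = sum(sim_count[c] * ref_count[c] for c in classes) / n ** 2
    # Maximum agreement attainable with these class quantities.
    p_max = sum(min(sim_count[c], ref_count[c]) for c in classes) / n
    if p_max == p_exp:
        raise ValueError("Klocation is undefined when Pmax equals Pe")
    kappa = (p_obs - p_exp) / (1 - p_exp)
    k_location = (p_obs - p_exp) / (p_max - p_exp)
    return kappa, k_location

# Hypothetical maps: the simulation over-predicts class 1 by one cell.
reference_2018 = [[1, 1, 2],
                  [1, 2, 2],
                  [3, 3, 3]]
simulated_2018 = [[1, 1, 2],
                  [1, 1, 2],
                  [3, 3, 3]]
kappa, k_location = kappa_and_klocation(simulated_2018, reference_2018, [1, 2, 3])
```

In this toy case the only disagreement is a quantity error, so Klocation equals 1 while the standard Kappa is lower; this separation of location and quantity errors is the reason the Kappa variations are used.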

3 Results and discussion

3.1 The spatial–temporal land use/land cover change

According to Fig. 4 and Table 2, the study area witnessed increasing urbanization and changes in the different LULC classes during the 1999–2018 period. The LULC change maps show that the built-up area experienced the largest change, with a total increase of 33%. During the first and second periods this increase came mainly at the expense of the vegetation class, which decreased by 11.0771 km2 and 4.2039 km2, respectively, while in the third period it came at the expense of the bare soil class, which decreased by 18.8343 km2. The increase in built-up area probably resulted from migration towards the city, which offers better education, business and job opportunities. This is consistent with many LULC studies conducted in Morocco [6, 37, 38] and elsewhere [28, 39,40,41]. Urban growth in El Jadida was persistent, although its rate was not steady because of changes in city policies. In the first period, particularly in the late 1990s, El Jadida was content with the urban dispersion that the industrial area and the port of Jorf Lasfar had created away from the city center, resulting in a small annual growth rate of 25% that took no account of the rise in population. The rising demand for public and administrative services and housing is both a cause and an effect of this industrial success. By 2006, attempts to solve the problem started to emerge through major investments that stimulated the urban dynamics of the city, such as the hippodrome and the Mohammed V hospital in the north-east, the residential parks in the south and the tourist parks in the north-west, bringing the annual growth rate to its peak of 1.62. The growth rate then dropped to 1.33 in the period between 2010 and 2018. After the classification of satellite data, the reliability of the results depends on the overall accuracy of the classified images, which indicates whether the LULC changes have been accurately identified and extracted. According to Anderson (1976) [42], the reliability of classified images is established by estimating their overall accuracy, which should exceed the minimum acceptable standard of 85% stipulated by the USGS classification scheme. The accuracy assessment showed an overall accuracy of 79% in 1999, 84% in 2006, 91% in 2010 and 93.4% in 2018. Misclassified pixels were mostly mixed pixels observed along the boundaries between land-cover types. Such mixed pixels are inherent in medium spatial resolution images, such as Landsat images, and are considered a main reason for classification errors [43].

Fig. 4 Land use/cover change from 1999 to 2018

Table 2 Net change of each land cover category during 1999–2006, 2006–2010, and 2010–2018

3.2 Gradient analysis with spatial metrics

Dramatic LULC changes affect urban form by altering the patterns of the landscape. Landscape metrics such as NP, NLSI and SHEI are significant indicators for evaluating landscape attributes such as diversity, shape and fragmentation. Figure 5 shows the NP, NLSI and SHEI values per direction and per gradient for the different periods. The number of patches (NP) metric describes the degree of fragmentation or clumped growth of the built-up area, counted as patches.

Fig. 5 Spatial metrics zone-wise for each gradient during all years: a Zonal and gradient division of the study area; b The SHEI per year and per direction; c The NP for the SE direction; d The NP for the SW direction; e The NLSI for the SE direction; f The NLSI for the SW direction

The decreasing NP in the core area for both directions between 1999 and 2006 is a sign of clumping. The clumping continued until 2006 for all of the first six circles in the SW. After 2006, the NP in the SE started to increase and spread, reaching the 5000, 6000 and 7000 m circles with high values ranging between 230 and 300 in 2018. The SW experienced severe fragmentation in the 5000, 6000 and 7000 m circles, where the NP reached 131, 159 and 168, respectively, in 2018, compared with only 29, 48 and 47 in 2010. The NLSI started with high values in the SW and SE regions in 1999, reflecting disaggregation, especially within the 7000 and 8000 m circles, with values of 0.46 and 0.78 in the SE and 0.58 and 0.75 in the SW, respectively (a value of 1 means completely disaggregated). Up to 2010, the NLSI showed a decreasing trend, which means that the patches were moving towards a more compact, simplified shape [44]. In 2018, the values rose slightly again, especially in the 5000, 6000 and 7000 m circles in the SE and in the four outermost circles of the SW. As expected from the NP, the inner circles are more aggregated than the outer ones. The SHEI values computed at the landscape level started relatively high (especially in the SE) and kept increasing, reaching 0.9079 in the SW and 0.9184 in the SE in 2018, which means that the built-up area grew towards a more even distribution in both directions, with slightly more evenness on the SE side.
The increase in NP and SHEI, especially in the outer circles, indicates spread [45] that may be partly due to leapfrog urban growth [46], while the clumping and the decline in shape complexity in the inner core area can be partly attributed to extension and infill urban growth. This is similar to the case reported in [47], where the built-up spatial pattern of Sancaktepe district had, by 2009, become contiguous as new development tended to infill around existing development, forming large contiguous patches.

3.3 LULC forecasting and analysis in 2040

After detecting and highlighting the urban change, a set of potential driving factor maps was prepared and integrated as sub-models in order to run the MLP and generate transition potential maps. To test the degree of association between each potential driving factor and the changes, Cramer’s V test was performed first. Tables 3 and 4 present the Cramer’s V scores for the factors considered. The scores show that distance to the urban area is the most influential factor for both sub-models, while the slope factor was eliminated because its value (0.0836) is lower than the acceptance threshold of 0.15 [48]. However, because this test cannot capture the intricacy of the relationship for values above 0.15, it was necessary to check the sensitivity analysis after running the MLP. Contrary to what Cramer’s V indicated, the sensitivity analysis shows that the most influential variable for the “bare soil to built-up” sub-model is distance from roads: excluding this variable causes the overall accuracy to drop from 79.25 to 71.7. For the “vegetation to built-up” sub-model, the sensitivity test confirmed distance to the urban area as the most influential factor. It also revealed that some factors negatively affect the accuracy of the sub-model; excluding the elevation factor, for example, would increase the overall accuracy by 0.10. Several MLP runs were conducted in which the variables that reduced the overall accuracy of the first sub-model were excluded, and only two final explanatory factors were kept.

Table 3 Cramer’s V test of potential driving factors
Table 4 Adopted driving factors and MLP transition potential accuracy

In order to verify the model performance, a simulation of the year 2018 was performed on the basis of (1) the land use data for 2006, 2010 and 2018, (2) the transition potential maps and (3) the Markov transition probability matrix, and was compared with the actual LULC of 2018 by means of the Kappa index. The resulting statistics show a Klocation value of 0.7902. The validation process therefore indicates a successful prediction of the LULC map for 2018: the accuracy rate is acceptable, and the performance of LCM in identifying the grid-cell-level location of future change is satisfactory. As a consequence, future urban sprawl in 2040 can be predicted (Fig. 6). The built-up area is projected to increase from 29.9907 km2 in 2018 to 43.8174 km2 in 2040, a total increase of 13.8267 km2 in 22 years, with an average annual growth rate of 0.6284.
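The growth figures above can be checked with a few lines of arithmetic. The reading of the annual rate as the total increase divided by the number of years (in km2 per year) is our interpretation, since it reproduces the reported value to rounding; the compound rate is shown only as an alternative convention and is not taken from the study.

```python
built_2018 = 29.9907   # km2, from the text
built_2040 = 43.8174   # km2, from the text
years = 2040 - 2018    # 22 years

increase = built_2040 - built_2018             # total increase in km2
linear_rate = increase / years                 # km2 per year (assumed reading)
compound_rate = (built_2040 / built_2018) ** (1 / years) - 1  # fraction per year

print(f"increase: {increase:.4f} km2")
print(f"linear rate: {linear_rate:.4f} km2/yr")
print(f"compound rate: {compound_rate:.2%} per yr")
```

The linear rate evaluates to roughly 0.6285 km2 per year, which agrees with the reported 0.6284 up to rounding of the last digit.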

Fig. 6 Land use/land cover map by year: a 2018 actual, b 2018 simulated and c 2040 simulated

The expansion projected for 2040 appears to continue the trends of the previous years, characterized by linear development along the main roads, extension from the 2018 urban area and some infill growth. As for the spatial metrics, SHEI is forecast to reach its highest level of evenness in the SW region by 2040, with a value of 0.9203, while in the SE the value drops to 0.8849, indicating the beginning of the dominance of the built-up class. The number of patches (Fig. 7) also shows a major decrease in both directions, with the majority of the five circles reaching full clumping (NP = 1). The NLSI values for 2040 at the circle level are low in both directions, ranging only between 0 and 0.1, which means that the patches grew into compact, simple shapes.

Fig. 7 The NP of 2040 per direction and per circle; the NLSI of 2040 per direction and per circle

4 Conclusion

This study used a combined approach of remote sensing, GIS and statistical models to reveal, quantify and predict urban growth in El Jadida city. First, a maximum likelihood classification of the Landsat satellite images of 1999, 2006, 2010 and 2018 was performed to produce the LULC maps. Second, in order to quantify the urban landscape pattern in terms of diversity, shape complexity and fragmentation, the SHEI, NLSI and NP indices were computed with the FRAGSTATS software according to a zonal division. For the prediction of future urban growth, we opted for the LCM model, since it integrates the MLP, the CA model and Markov chains and is able to process heterogeneous data. The accuracy of the predicted 2018 LULC map was validated using the Klocation index (the difference between the actual and the simulated built-up area is 0.9 km2). To better understand the predicted result, we computed the spatial metrics used earlier in the study for the built-up class of 2040. This study can help local and regional planners gain insight into future urban growth and therefore adopt better-adapted management policies that may lead the city toward sustainable development.