Introduction

Urbanization is a global multidimensional phenomenon characterized by abruptly changing human population densities and land cover (Miller and Hutchins 2017). It is one of the primary causes of land cover change in cities compared to rural areas. Urbanization proceeds by converting vegetation and bare soil into built-up areas to improve economies and social standards (Ziaul and Pal 2018). Urban development strategies are executed in metropolitan areas in response to rapid population growth and land scarcity (Hu et al. 2020). Global urbanization continues to gain momentum, and the urban population has surpassed the rural population (Chen et al. 2014). The United Nations world urbanization projections indicate that only 30% of the population lived in cities in 1950 and 55% in 2018, and this share is anticipated to reach 68% by 2050 (United Nations 2018). In this context, land use land cover (LULC) (Aouissi et al. 2021) and biodiversity have changed tremendously under urban climates due to urban sprawl and oversaturation, primarily in regions with concentrated consumption and production systems (Li et al. 2022).

Urban development induces major environmental problems in metropolitan areas, such as air pollution (Wang et al. 2020), urban flooding (Gaagai et al. 2022; Farhadi and Najafzadeh 2021), pandemics (Leveau et al. 2022), environmental unsustainability (Liu and Jiang 2021), and waste management issues (Kebaili et al. 2022). Several researchers have investigated urbanization impacts on the environment (Kazazi et al. 2022), biodiversity (Aouissi et al. 2017), food systems (de Bruin et al. 2021), and land surface temperature (Govind and Ramesh 2019). In this regard, sustainability assessment under urban environments plays a vital role in decision-making related to construction activities, disaster management, and environmental standard improvement (Bhatta 2010).

LULC changes are responsible for significant variations in the overall natural landscape (Mannan et al. 2021; Shah et al. 2021). A large proportion of LULC change has been caused by rapid urban sprawl, which has emerged as a key factor in natural resources management and protection. Urban sprawl can be broadly described as a demographic process that includes socioeconomic and LULC changes in a region of interest (Kuang et al. 2020). Moreover, an evident relationship has been observed between changes in LULC and the population increase caused by excessive migration to urban regions. Hence, substantial population growth drives a massive conversion of arable land into built-up areas, vegetation reduction, and water scarcity (Reiner et al. 2015). In under-developed countries, conventional surveying and mapping approaches are costly and time-consuming, while high-quality data are not easily accessible. Consequently, Geographic Information System (GIS) and remote sensing tools have gained momentum in urbanization monitoring (Liu and Jiang 2021). These tools are advantageous, particularly in providing historically reliable and coherent information, which makes them significant for implementing the Sustainable Development Goals (SDGs), as Christensen and Jokar Arsanjani (2020) suggested.

Remote sensing systems, such as the Landsat sensors, provide reliable information for detecting changes in urbanization patterns. Since the 1970s, multispectral Landsat sensors, i.e., Landsat 1, 2, and 3 Multispectral Scanner (MSS), Landsat 4 and 5 Thematic Mapper (TM), Landsat 7 Enhanced Thematic Mapper Plus (ETM+), and Landsat 8 Operational Land Imager (OLI), have produced freely available moderate-resolution data (Zhu et al. 2019). Even though high-resolution satellite products (< 10 m) have progressed rapidly over the last 20 years, moderate-resolution data have dominated earth observation studies owing to their temporal coverage of over 50 years (Liu and Jiang 2021).

For instance, Landsat data have been widely used to map urbanization patterns in India (1973–2010), China (1999–2016), Bangladesh (1975–2003), and Pakistan (1990–2020) through the integration of remote sensing and GIS-based methodologies (Dewan and Yamaguchi 2009; Moghadam and Helbich 2013; Mannan et al. 2021; Li et al. 2022). In this regard, many researchers have considered various classification methods for satellite data, including supervised and unsupervised learning, principal component analysis (PCA), fuzzy logic, and hybrid classification (Butt et al. 2015). Nevertheless, supervised learning has received much attention because the analyst controls the training process. Therefore, this paper proposes a remote sensing and GIS approach using open-source tools to assess urbanization patterns in Islamabad. The objectives are (1) to delineate and examine land use land cover (LULC) changes over Islamabad city during the past 42 years (1979–2020) using Landsat data and (2) to investigate the impact of demographic growth on urbanization.

Materials and methods

Study area

Islamabad is located in Northern Pakistan at 33° 49′ north latitude and 72° 24′ east longitude. The city occupies an area of 906 km2 at an altitude of 457 to 610 m (Butt et al. 2015). Islamabad is divided into five administrative zones comprising diverse land cover types and geographic features, as shown in Fig. 1. These five administrative zones are categorized into sectors from A to I, with each sector being divided from 1 to 18 (Javaid and Waheed 2021). Zone I was designated for administrative land use and commercial housing properties, Zone II was assigned to private-sector properties, and Zone III is characterized by mountains, forests, and piedmonts comprising the Margalla Hills National Park (Aslam et al. 2021). Agricultural landscapes and rural areas mainly characterize Zone IV, whereas the southern side of Zone V, near Rawalpindi city, consists of private houses and industrial facilities (Maria and Imran 2006). At 900 m altitude, tropical evergreen broadleaf forest is the dominant natural vegetation, with deciduous broadleaf and subtropical evergreen coniferous forests above this altitude (Liu and Jiang 2021). Islamabad has a humid subtropical climate according to the Köppen climate classification system. The city experiences hot summers with monsoon rain in July and August and an average temperature of 38 °C, and cold winters with occasional snowfall over the hills and an average temperature of 4 °C (Shah et al. 2021). Owing to the capital’s high living standards, employment opportunities, and education quality, the city has grown swiftly in size. This continuous demographic growth has posed severe environmental challenges such as air pollution and solid waste mismanagement.

Fig. 1

The study area location and its principal Capital Development Authority (CDA) zones

Data acquisition and preparation

Because Landsat satellite imagery offers valuable support for analyzing extended urban areas owing to its cost-effectiveness, medium spatial resolution, and long temporal monitoring period (Shah et al. 2021), it was used in this research. Landsat data covering the last 42 years were collected for path 161 and row 37 from the US Geological Survey (USGS). The specifications of the data used are listed in Table 1.

Table 1 Main data source and properties

Since Landsat data are regularly updated and freely accessible through the Earth Explorer repository, they are a reliable and efficient source of information for mapping urbanization patterns (Butt et al. 2015). Therefore, five Landsat products (Level 1 scene-based products) with zero cloud coverage were obtained for 1979, 1990, 2000, 2010, and 2020. Islamabad’s administrative boundary vector data were collected from DIVA-GIS (https://www.diva-gis.org/). Since the satellite data were acquired by different Landsat sensors, i.e., MSS, TM, and OLI, the images were projected to the Universal Transverse Mercator (UTM) coordinate system, Zone 43 N, on the World Geodetic System (WGS) 1984 datum and resampled to a common spatial resolution of 30 m to produce a homogeneous time-series dataset. Layer stacking and sub-setting based on the area of interest (AOI) extent were also applied using the Quantum GIS (QGIS) software.
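The resampling step can be illustrated with a minimal sketch. The text does not state which resampling kernel was applied in QGIS, so nearest-neighbour lookup is assumed here, and the 60 m input cell size and the function name `resample_nearest` are hypothetical choices made only for illustration.

```python
def resample_nearest(grid, src_res, dst_res):
    """Resample a 2-D grid (list of rows) from src_res to dst_res metres
    with nearest-neighbour lookup, keeping the same ground extent."""
    src_rows, src_cols = len(grid), len(grid[0])
    # Number of output cells needed to cover the same extent.
    dst_rows = round(src_rows * src_res / dst_res)
    dst_cols = round(src_cols * src_res / dst_res)
    out = []
    for r in range(dst_rows):
        # Centre of the output cell expressed in source-cell units.
        src_r = min(int((r + 0.5) * dst_res / src_res), src_rows - 1)
        row = []
        for c in range(dst_cols):
            src_c = min(int((c + 0.5) * dst_res / src_res), src_cols - 1)
            row.append(grid[src_r][src_c])
        out.append(row)
    return out


# Hypothetical 2 x 2 coarse band (assumed 60 m cells) resampled to 30 m.
coarse = [[10, 20],
          [30, 40]]
fine = resample_nearest(coarse, src_res=60, dst_res=30)
for row in fine:
    print(row)
```

Nearest-neighbour lookup leaves the original digital numbers unchanged, which is why it is a common choice ahead of classification; bilinear or cubic kernels would smooth the values instead.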

Remotely sensed data processing and LULC classification

Remotely sensed data processing enhances spectral information quality by removing spectral noise caused by atmospheric and radiometric distortions, producing a direct link between real-world conditions and observed data (Butt et al. 2015). Data processing steps such as extraction, rectification (Mohammed and Ali 2014), georeferencing (Eugenio and Marqués 2003), and classification (Congedo 2013) were applied to the five satellite images (1979, 1990, 2000, 2010, and 2020). LULC interpretation was performed by classifying the study area into four major classes, i.e., built-up, vegetation, bare soil, and water. A detailed explanation of the LULC classes is provided in Table 2. The NIR-R-G band combination was used for Landsat 5 and Landsat 8 data to visually distinguish the LULC classes, as suggested by Frutuoso et al. (2021). Then, training samples for each class were selected through a meticulous visual interpretation based on Google Earth historical images. A total of 200 training samples, 50 for each class, were collected from the satellite data by demarcating polygons around the corresponding class. The training samples were refined, merged, removed, and renamed after a rigorous assessment of the statistical parameters for each class.

Table 2 Description of land cover classes used for supervised classification

Training samples were used for the spectral classification of each satellite image based on supervised classification. Because it is considered a knowledge-based expert system for feature extraction, supervised learning is primarily controlled by the analyst, who selects the corresponding pixels for each class (Boori et al. 2015). Many supervised learning algorithms, such as random forest (RF) (Khan and Sudheer 2022), maximum likelihood classification (MLC) (Nkwunonwo 2013), and the minimum distance algorithm (MDA) (Rojas et al. 2020), have been used in similar studies. In the current study, MLC was selected for its efficiency in LULC classification, as it produces negligible probability errors, as Faisal Koko et al. (2021) recommended. The detailed methodological workflow is presented in Fig. 2.
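To make the principle behind MLC concrete, the following is a minimal sketch of a Gaussian maximum likelihood classifier. It is not the QGIS implementation used in the study. It assumes only two bands (red and NIR), equal prior probabilities for all classes, and hypothetical reflectance values for the training samples; all function names are illustrative.

```python
import math


def fit_class(samples):
    """Estimate the mean vector and 2x2 covariance of (red, nir) samples."""
    n = len(samples)
    m1 = sum(s[0] for s in samples) / n
    m2 = sum(s[1] for s in samples) / n
    c11 = sum((s[0] - m1) ** 2 for s in samples) / (n - 1)
    c22 = sum((s[1] - m2) ** 2 for s in samples) / (n - 1)
    c12 = sum((s[0] - m1) * (s[1] - m2) for s in samples) / (n - 1)
    return (m1, m2), (c11, c12, c22)


def log_likelihood(pixel, mean, cov):
    """Gaussian log-likelihood of a pixel, up to an additive constant."""
    c11, c12, c22 = cov
    det = c11 * c22 - c12 * c12
    d1, d2 = pixel[0] - mean[0], pixel[1] - mean[1]
    # Squared Mahalanobis distance using the closed-form 2x2 inverse.
    maha = (c22 * d1 * d1 - 2 * c12 * d1 * d2 + c11 * d2 * d2) / det
    return -0.5 * (math.log(det) + maha)


def classify(pixel, models):
    """Assign the class whose Gaussian model gives the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(pixel, *models[name]))


# Hypothetical (red, nir) reflectance training samples for the four classes.
training = {
    "vegetation": [(0.05, 0.45), (0.06, 0.50), (0.04, 0.48), (0.07, 0.42)],
    "built_up": [(0.20, 0.25), (0.22, 0.28), (0.25, 0.27), (0.21, 0.30)],
    "water": [(0.03, 0.02), (0.04, 0.03), (0.02, 0.01), (0.05, 0.02)],
    "bare_soil": [(0.30, 0.38), (0.33, 0.40), (0.28, 0.35), (0.35, 0.41)],
}
models = {name: fit_class(samples) for name, samples in training.items()}

for pixel in [(0.05, 0.46), (0.22, 0.27), (0.03, 0.02), (0.32, 0.39)]:
    print(pixel, classify(pixel, models))
```

In a real workflow the classifier would use all available bands and the 50 training polygons per class described above; the two-band form is kept here only so that the covariance inverse can be written out explicitly.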

Fig. 2

Workflow used in this research for urbanization patterns mapping

Accuracy assessment

An accuracy assessment is fundamental to validate the information produced for LULC change analysis (Shah et al. 2021). This step aims to measure the reliability and quality of the LULC classification based on remotely sensed data. The test was performed by drawing a stratified random sample of 50 points from each classified image and comparing the classified image with the reference image. A confusion matrix was then prepared for each classified image and used to determine the parameters describing classification accuracy, namely the overall accuracy and the kappa coefficient. The overall accuracy is the proportion of samples for which the reference and classified data agree, whereas the kappa coefficient measures the level of agreement between the reference and classified maps after accounting for agreement expected by chance (Wiatkowska et al. 2021). In practice, the kappa coefficient ranges from 0 to 1, with a value greater than 0.8 showing excellent accuracy, from 0.4 to 0.8 demonstrating moderate accuracy, and lower than 0.4 revealing low accuracy (Mawenda et al. 2020).
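The two accuracy measures can be sketched as follows. The confusion matrix below is hypothetical (50 samples, rows as reference classes and columns as classified classes) and does not reproduce the study's data; the function names are illustrative. The interpretation thresholds follow the ranges quoted above.

```python
def confusion_matrix(reference, predicted, classes):
    """Build a matrix with reference classes as rows and predictions as columns."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[ref]][index[pred]] += 1
    return matrix


def overall_accuracy(matrix):
    """Share of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa(matrix):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p_observed = overall_accuracy(matrix)
    # Chance agreement from the products of row and column totals.
    p_chance = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(n)
    ) / total ** 2
    return (p_observed - p_chance) / (1 - p_chance)


def interpret_kappa(value):
    """Apply the thresholds quoted in the text."""
    if value > 0.8:
        return "excellent"
    if value >= 0.4:
        return "moderate"
    return "low"


# Hypothetical matrix; class order: built-up, vegetation, bare soil, water.
matrix = [
    [11, 0, 1, 0],
    [0, 12, 1, 0],
    [1, 1, 11, 0],
    [0, 0, 1, 11],
]
print(overall_accuracy(matrix), kappa(matrix), interpret_kappa(kappa(matrix)))
```

Kappa is always lower than the overall accuracy for the same matrix when chance agreement is positive, which is why both values are reported side by side.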

Analysis of built-up areas in relationship with demographic growth

To determine the variation of urbanization patterns over time and space, the built-up area class alone was extracted to represent urbanization from 1979 to 2020. Its spatiotemporal extent was compared with the population growth data published by the United Nations (UN) in the World Population Prospects (United Nations 2019). The 26th edition of the UN population projections and estimations was considered since it includes population estimates from 1950 to 2019, with projections until 2100, for 235 nations (Cohen 2006). These estimations reflect a range of scenarios at national, regional, and global levels. These datasets provide a consistent reference for observing global progress toward the SDGs by 2030 (Vollset et al. 2020). Once population estimates were acquired for 1979, 1990, 2000, 2010, and 2020, the relationship between population and the extent of built-up areas was examined to understand the urbanization patterns in Islamabad over the past four decades.

Results and discussion

Assessment of LULC changes in Islamabad between 1979 and 2020

Remote sensing and GIS are reliable tools that have proven effective for urbanization monitoring in metropolitan cities (Olorunfemi et al. 2020); spatiotemporal patterns are determined via a LULC classification-based approach to characterize socioeconomic growth and environmental change in a well-defined urban area (Xu et al. 2019). In the current study, remote sensing datasets were used to detect LULC changes in Islamabad between 1979 and 2020. As shown in Fig. 3, the approach produced five maps, one per year, each categorizing four LULC classes, i.e., water bodies, vegetation, built-up area, and bare soil. Water bodies (blue) decreased from 23.43 km2 in 1979 to 8.79 km2 in 1990, increased to 9.28 km2 in 2000 and 10.25 km2 in 2010, and then decreased to 6.93 km2 in 2020. Overall, water bodies decreased from 23.43 km2 in 1979 to 6.93 km2 in 2020, a reduction of 16.50 km2. Vegetation (green) showed irregular changes over the last four decades, increasing from 184.35 km2 in 1979 to 212.8 km2 in 1990, decreasing to 185.3 km2 in 2000, increasing again to 240.8 km2 in 2010, and decreasing to 136.07 km2 in 2020. The vegetated area thus diminished from 184.35 km2 in 1979 to 136.07 km2 in 2020, a total loss of 48.29 km2. The built-up area (red) increased steadily from 41.31 km2 in 1979 to 52.8 km2 in 1990, 74.59 km2 in 2000, 142.8 km2 in 2010, and 152.52 km2 in 2020. The largest increase occurred between 2000 (74.59 km2) and 2010 (142.8 km2), and the overall increase from 1979 to 2020 was 111.20 km2, as illustrated in Fig. 4 and Table 3. Furthermore, the bare soil area (beige) varied irregularly from 596.25 km2 in 1979 to 571.43 km2 in 1990, 576.1 km2 in 2000, 451.3 km2 in 2010, and 549.89 km2 in 2020.
Overall, the bare soil area decreased from 596.25 to 549.89 km2 in 42 years, a reduction of 46.3 km2. The results show that Islamabad experienced a massive increase in built-up areas during the past four decades (1979–2020), which agrees with the studies of Butt et al. (2015) and Bokhari et al. (2022). This expansion in built-up areas can be explained by a rise in real estate and housing organizations working within Islamabad (Maria and Imran 2006). The city has encountered a massive migration of individuals from rural areas to metropolitan regions for several purposes, such as career development, education, and entrepreneurship, as suggested by Mannan et al. (2021). Government strategies for infrastructure improvement and industrial development have tremendously promoted the real estate industry, leading to accelerated urbanization (Hassan et al. 2016). Similar findings were reported by Shah et al. (2021), Liu and Jiang (2021), and Hassan et al. (2016), confirming urban sprawl and its impacts in Islamabad for the 1979–2019, 1990–2018, and 1992–2012 time frames, respectively.
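The net and per-decade changes quoted above follow from simple differences of the class areas. The short sketch below recomputes them from the areas reported in the text; the helper names are illustrative, and the results can differ from the quoted totals by about 0.01 km2 because the published areas are already rounded.

```python
# Class areas in km2 as reported in the text for each mapped year.
areas_km2 = {
    "water": {1979: 23.43, 1990: 8.79, 2000: 9.28, 2010: 10.25, 2020: 6.93},
    "vegetation": {1979: 184.35, 1990: 212.8, 2000: 185.3, 2010: 240.8, 2020: 136.07},
    "built_up": {1979: 41.31, 1990: 52.8, 2000: 74.59, 2010: 142.8, 2020: 152.52},
    "bare_soil": {1979: 596.25, 1990: 571.43, 2000: 576.1, 2010: 451.3, 2020: 549.89},
}


def net_change(series, start, end):
    """Area difference between two years, rounded to 0.01 km2."""
    return round(series[end] - series[start], 2)


def largest_decadal_gain(series):
    """Return ((start_year, end_year), gain) for the largest increase
    between consecutive mapped years."""
    years = sorted(series)
    gains = {
        (a, b): round(series[b] - series[a], 2)
        for a, b in zip(years, years[1:])
    }
    return max(gains.items(), key=lambda item: item[1])


for name, series in areas_km2.items():
    print(name, net_change(series, 1979, 2020))
print(largest_decadal_gain(areas_km2["built_up"]))
```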

Fig. 3

Spatio-temporal changes in LULC for a 1979, b 1990, c 2000, d 2010, and e 2020

Fig. 4

Comparison of land use land cover changes for four major land classes between 1979 and 2020

Table 3 LULC classes between 1979 and 2020 in surface area and percentage

LULC classification maps were overlaid to derive a change map, as shown in Fig. 5. Changes in each class in terms of area (km2) are given in Table 4. The smallest changes were from built-up to waterbodies (0.01 km2), vegetation to waterbodies (0.17 km2), and bare soil to waterbodies (0.97 km2). Significant changes were observed during the last four decades from vegetation to bare soil (80.84 km2) and from bare soil to built-up areas (120.32 km2). The latter was the most significant LULC change (Fig. 5 and Table 4); it is shown in pink in Fig. 5 and is concentrated on the western and southern sides of Islamabad city. These changes can be explained by developments in industry, residency, trade, traffic infrastructure, demographic growth, and administrative facilities (Butt et al. 2015). Similar results were found by Hassan et al. (2016), who showed that the maximum conversion was from bare soil to built-up area. Urbanization is expanding consistently through activities like infrastructure development, parking garages, and street buildings (Kazazi et al. 2022). The landscape has consequently changed from a natural environment to urban areas, which is an irreversible transformation. If the expansion continues in the same way, it may lead to habitat destruction, agricultural loss, and environmental degradation (Javaid and Waheed 2021). Commercialization and industrialization in Islamabad further contribute to urban growth, which has resulted in a massive migration of people during recent decades (Liu and Jiang 2021).
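The overlay that yields such class-to-class change areas can be sketched as a cross-tabulation of two co-registered class maps. The 3 x 3 grids and class labels below are hypothetical, and the conversion to area assumes the 30 m pixel size used in this study; the function name is illustrative and this is not the QGIS procedure itself.

```python
from collections import Counter

# A 30 m x 30 m pixel covers 900 m2, i.e., 0.0009 km2.
PIXEL_AREA_KM2 = (30 * 30) / 1_000_000


def transition_areas(map_start, map_end):
    """Cross-tabulate two co-registered class maps and return the area (km2)
    of every (from_class, to_class) pair where the class changed."""
    counts = Counter()
    for row_start, row_end in zip(map_start, map_end):
        for before, after in zip(row_start, row_end):
            if before != after:
                counts[(before, after)] += 1
    return {pair: n * PIXEL_AREA_KM2 for pair, n in counts.items()}


# Hypothetical 3 x 3 class maps for the first and last year.
lulc_1979 = [
    ["soil", "soil", "veg"],
    ["soil", "veg", "veg"],
    ["water", "soil", "built"],
]
lulc_2020 = [
    ["built", "built", "veg"],
    ["soil", "soil", "veg"],
    ["water", "built", "built"],
]

changes = transition_areas(lulc_1979, lulc_2020)
for (before, after), area in sorted(changes.items()):
    print(f"{before} -> {after}: {area:.4f} km2")
```

Pixels whose class did not change are skipped, so the result lists only the conversions that a change map would highlight.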

Fig. 5

Major land use land cover changes in Islamabad from 1979 to 2020

Table 4 Major LULC changes from 1979 to 2020

Accuracy assessment of LULC classification in Islamabad between 1979 and 2020

The most significant component of LULC classification is determining the quality and reliability of the quantitative information extracted from remotely sensed data products. This is achieved through an accuracy assessment based on the overall accuracy and kappa statistics. In the present study, the overall accuracy of the LULC classification for 1979, 1990, 2000, 2010, and 2020 was 89%, 88%, 93%, 87%, and 89%, respectively. Moreover, the kappa indexes for the same years were 0.85, 0.84, 0.91, 0.83, and 0.85, respectively. The overall accuracy is above 85% for every year, with kappa values higher than 0.8, revealing strong agreement with the reference data and hence the reliability of the produced maps (Mawenda et al. 2020). The results show high accuracy for all LULC maps, with the 2000 map yielding the highest values, as shown in Table 5.

Table 5 Accuracy assessment of supervised classification over time

Relationship between urbanization and demographic growth in Islamabad between 1979 and 2020

LULC change monitoring and population dynamics assessment are fundamental for decision-making and land management planning to guarantee sustainability in social, economic, and environmental aspects. These objectives align with the UN SDGs and the 2030 agenda for sustainable development, as Christensen and Jokar Arsanjani (2020) suggested. Based on Fig. 6, a comparison between built-up areas and population size over time revealed a positive association, with the population increasing from 168,745 in 1979 to 1,129,198 in 2020. This led to an increase in built-up areas from 41.31 km2 in 1979 to 152.52 km2 in 2020, mainly caused by infrastructure and settlement expansion. Al-Sharif et al. (2013) have demonstrated that increased population density affects the urbanization rate through an increase in built-up areas; hence, a causal relationship exists between both factors. Islamabad has experienced rapid development in its industrial, agricultural, urban, and commercial sectors over the last four decades (Bokhari et al. 2022). Furthermore, demographic growth directly influences economic progress as it facilitates access to resources that support economic, social, and ecological stability (Mannan et al. 2021).
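One simple way to compare the two trends is to annualize both using log-ratio growth rates and take their ratio. The sketch below does this for the endpoint values quoted above; it is an illustrative calculation rather than the analysis performed in the study, the helper name is hypothetical, and only the 1979 and 2020 figures from the text are used.

```python
import math


def annual_log_growth_rate(start, end, years):
    """Average annual growth rate from the log ratio of two values."""
    return math.log(end / start) / years


# Endpoint values quoted in the text.
population = {1979: 168_745, 2020: 1_129_198}
built_up_km2 = {1979: 41.31, 2020: 152.52}
years = 2020 - 1979

pop_rate = annual_log_growth_rate(population[1979], population[2020], years)
built_rate = annual_log_growth_rate(built_up_km2[1979], built_up_km2[2020], years)

# Ratio of built-up growth to population growth.
ratio = built_rate / pop_rate

print(f"population growth rate: {pop_rate:.4f} per year")
print(f"built-up growth rate:   {built_rate:.4f} per year")
print(f"ratio (built-up / population): {ratio:.2f}")
```

A ratio below 1 would indicate that built-up land expanded more slowly than the population in relative terms, and a ratio above 1 the opposite. With only two endpoints the result says nothing about how the relationship varied between decades.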

Fig. 6

Built-up areas in km2 versus population growth in Islamabad for the last 42 years

A limitation of the current study is the lack of reference data from a more reliable sampling frame for selecting training samples for each LULC class. Although a detailed visual interpretation based on Google Earth historical images was used to randomly extract training samples, this might have affected the overall accuracy. Nevertheless, the approach is justifiable given the scarcity of reference data for validation.

Conclusions

The study highlights the main changes in LULC in Islamabad city and examines the association between demographic growth and urbanization. The time scope of this research is sufficient to observe urbanization patterns. LULC classification was performed using an open-source dataset and GIS tool, i.e., Landsat data and QGIS, for the last four decades (1979–2020). The maximum likelihood algorithm was used to classify Landsat data into four LULC classes, i.e., waterbodies, vegetation, built-up area, and bare soil. Remotely sensed data were integrated with population data to analyze the trend between the urbanization process and demographic growth over the last 42 years, and the results led to the following conclusions:

  1. The built-up area in Islamabad has shown a massive increase from 1979 to 2020 by 111.20 km2, whereas vegetation, waterbodies, and bare soil classes have significantly decreased during the same time frame.

  2. The dynamics of LULC change showed a tremendous decrease in bare soil in favor of built-up areas, with a value of 120.32 km2. Hence, the marked urbanization is attributed to development activities in Islamabad.

  3. Based on the accuracy assessment test, LULC maps showed a relatively high overall accuracy and kappa index, with the 2000 map yielding the highest values of 93% and 0.91.

  4. Over the last 42 years, the population rose from 168,745 to 1,129,198, revealing its significant impact on urban expansion through the installation of new infrastructure and settlement construction.

  5. There is an urgent need for an integrated approach to sustainable development and urban planning in Islamabad according to the spatial and temporal trends of built-up areas and population growth.

  6. Future work will focus on incorporating factors such as land surface temperature, precipitation, humidity, and carbon dioxide (CO2) concentration to examine their connection with urbanization patterns in Islamabad. In addition, NUA (New Urban Agenda) indicator 15 can be used to identify the relationship between population and built-up areas.