
1 Introduction

As a vital component of the Earth's ecosystem, land cover plays an important role in a wide range of fields, including climate and hydrological modelling, biogeochemical cycling studies, environmental protection, biodiversity conservation, resource management and the study of complex interactions between human activities and global change [1,2,3, 37]. Land cover products have therefore become an indispensable input in Earth science applications such as climate modelling [4,5,6], greenhouse gas emissions [7], the carbon cycle [8,9,10], biomass modelling [11,12,13], hydrological modelling [14], land erosion [15], biodiversity [16] and disasters [17].

Change in land cover is one of the most important drivers of environmental variability. Since the industrial revolution, the exploitation of natural resources for development has caused enormous changes to the Earth's surface and its environment [18]. Consequently, anthropogenic modifications of land cover have had feedback impacts on ecosystems, especially biodiversity loss, increased carbon emissions, surface albedo change and their implications, such as climate change. Newbold (2015) [16] found that land use change has reduced species richness by 8.1% on average globally, and that richness is projected to fall by a further 3.4% globally by 2100. Houghton (2010) [9] and Shevliakova et al. (2009) [10] revealed that land use and land cover change caused a gradually increasing trend in carbon emissions, from ∼0.6 PgC yr−1 in 1850 to ∼1.3 PgC yr−1 in the period 1950–2005. The fifth assessment report of the Intergovernmental Panel on Climate Change [19] showed that surface albedo variability caused by land cover and land use change contributed a negative radiative forcing of −0.15 W m−2. Changes in albedo can in turn lead to changes in the surface energy and moisture budgets, atmospheric circulation and temperature.

Global land cover mapping currently faces two main challenges: coarse spatial resolution and disagreement between products. Validation studies have demonstrated that, among the original global land cover data sets, MODIS shows the highest accuracy at 78.3% [1], followed by GLC2000, GlobCover and IGBP-DISCover at 68.6% [20], 67.5% [21] and 66.9% [22], respectively. The highest-resolution (30 m) land cover map, FROM-GLC, reached an accuracy of only 64.9% [3].

Northern Vietnam has experienced large modifications in land use/land cover over the last decade due to forest plantation, changes in land use purpose, infrastructure construction and urbanization. Vietnam's country report for the Forest Resources Assessment of the Food and Agriculture Organization of the United Nations (FAO) (2015) [23] documented that the forest area in Vietnam increased by 1.696 million hectares (from 13.077 million hectares in 2005 to 14.773 million hectares in 2015). Other studies showed that the main cause of the forest gain was forest and rubber plantation in the north and centre of Vietnam [24, 25]. On the other hand, the construction of the largest hydropower plant in Southeast Asia submerged 7670 ha of agricultural land and 3170 ha of forest in the basin [26].

Previous studies on land cover mapping in the study area mainly used multi-temporal medium-resolution data from MODIS [27] or ADEOS II – GLI [28]. However, it is difficult to analyze land use/land cover changes in a region as heterogeneous as Northern Vietnam with 250 m resolution data such as MODIS or ADEOS II – GLI. Land cover maps with higher accuracy and resolution are therefore needed.

The purpose of this study is to produce 15-meter resolution, high-accuracy land cover maps of Northern Vietnam for 2007 and 2015 using multi-temporal and multi-sensor data. The change between the two years was then analyzed to reveal the land cover dynamics of the recent decade, focusing on forest gain and on land cover changes in the Son La hydropower reservoir area.

2 Methods

2.1 Study Area

The study area is Northern Vietnam, extending from 102°08′41″E to 108°04′12″E in longitude and from 19°17′15″N to 23°23′35″N in latitude (Fig. 1a). It covers about 127,200 km² and contains a variety of landscapes, including delta, hill lands, mountainous areas, mountainous highlands and coastal zones. The topography is characterized by the Red River Delta and other small coastal deltas in the south-east and by high mountain ranges in the north and north-west. According to the Köppen climate classification, the entire territory of Northern Vietnam belongs to the humid subtropical climate zone, with hot and humid summers and cold winters. This seasonal variation in climate exerts essential control on the agricultural calendar of the region. The Red River Delta is the second largest delta in Vietnam; it covers about 15,000 km², of which about 11,000 km² is recorded as paddy (https://www.gso.gov.vn). The region also includes sizeable orchard areas on intensively cultivated land, located mainly in the hill lands north of the Red River Delta and within the delta itself. In biophysical terms, the dominant forest type is evergreen broadleaf forest, while bamboo, deciduous and needleleaf vegetation account for a small percentage.

Fig. 1. (a) Study area; (b) distribution of field survey reference data

2.2 Data Used

The input data include multi-temporal and multi-sensor satellite imagery (Table 1). All multi-temporal data were selected within a one-year period, either 2007 or 2015. We used Surface Reflectance products of 140 Landsat-5 TM scenes from 2007 and 199 Landsat-8 OLI scenes from 2015, available from the United States Geological Survey (USGS) EarthExplorer. Since basic atmospheric and geometric corrections are already included in this high-level product, the Landsat images are ready to use after cloud removal. ASTER data, comprising 55 images from 2007 and 68 images from 2015, were retrieved free of charge from the MADAS system of the National Institute of Advanced Industrial Science and Technology (AIST), Japan. The source provides the ASTER-VA product, which already includes ortho-rectification. The PALSAR mosaic (2007) and PALSAR-2 mosaic (2015) were obtained from the JAXA Earth Observation Research Center (EORC). Synthetic Aperture Radar (SAR) imagery is cloud-free, sensitive to surface moisture and responsive to multiple backscattering mechanisms, which makes it possible to exploit information that can hardly be observed in optical data. After estimating the backscattering coefficient from the original data, areas affected by slope were masked using the mask files included in the downloaded data. The gaps created by the masking process were then filled using the nearest neighbor technique.
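The SAR preprocessing described above (digital number to backscatter, masking, nearest neighbor gap filling) can be illustrated with a minimal sketch. It is not the processing chain used in this study: the calibration factor of −83.0 dB is an assumption taken from the commonly documented conversion for JAXA mosaic products, the function names are hypothetical, and the brute-force search is only practical for a toy grid.

```python
import math

# Assumption: calibration factor (dB) commonly documented for JAXA
# PALSAR/PALSAR-2 mosaic digital numbers; not stated in this paper.
CALIBRATION_FACTOR_DB = -83.0


def dn_to_backscatter_db(dn):
    """Convert a mosaic digital number to backscatter in dB (None if no data)."""
    if dn <= 0:
        return None
    return 10.0 * math.log10(dn ** 2) + CALIBRATION_FACTOR_DB


def fill_gaps_nearest(grid, masked):
    """Fill masked or empty cells with the value of the nearest valid cell.

    Nearest is measured by Euclidean distance in pixel units. The search is
    brute force and assumes at least one valid cell exists.
    """
    valid = [(r, c) for r, row in enumerate(grid) for c, _ in enumerate(row)
             if not masked[r][c] and grid[r][c] is not None]
    filled = [row[:] for row in grid]
    for r, row in enumerate(grid):
        for c, _ in enumerate(row):
            if masked[r][c] or grid[r][c] is None:
                nr, nc = min(valid, key=lambda p: (p[0] - r) ** 2 + (p[1] - c) ** 2)
                filled[r][c] = grid[nr][nc]
    return filled


# Toy 3 x 3 tile; the centre pixel stands in for a slope-masked pixel.
dn = [[4000, 5000, 6000],
      [4500, 0, 6500],
      [4800, 5200, 7000]]
mask = [[False, False, False],
        [False, True, False],
        [False, False, False]]
backscatter = [[dn_to_backscatter_db(v) for v in row] for row in dn]
backscatter_filled = fill_gaps_nearest(backscatter, mask)
```

In this toy tile the masked centre pixel receives the value of one of its four direct neighbours, since they are all at the same minimum distance.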

Table 1. Characteristics of data used

For ancillary data, we used SRTM30 data to calculate slope in order to make use of topographic information. Other ancillary data, namely 500 m Suomi NPP night-time light and OpenStreetMap, were employed to enhance detection capability by adding information on the presence of human activities derived from night-time light and the road network. Suomi NPP images acquired from the National Oceanic and Atmospheric Administration (NOAA) were averaged to make a representative composite image and to reduce cloud contamination. The OpenStreetMap (OSM) road network was used to create a rasterized map of distance to the road network using the gdal_proximity.py utility in the GDAL library. The pixel value of the distance map represents the shortest distance to the road network. For geometric consistency, all input data were converted to the geographic coordinate system with the WGS84 datum (EPSG 4326). The preprocessing was carried out using Geographic Resources Analysis Support System (GRASS) GIS version 7.0, QGIS version 2.14 and the Geospatial Data Abstraction Library (GDAL).
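To make the meaning of the distance map concrete, the following sketch computes, for each cell of a small rasterized road layer, the Euclidean distance to the nearest road cell. It is a conceptual illustration only and does not reproduce gdal_proximity.py; the function name, the toy raster and the 15 m pixel size are assumptions made for the example.

```python
import math


def proximity_map(road_raster, pixel_size=1.0):
    """Return, for each cell, the Euclidean distance to the nearest road cell.

    road_raster holds 1 for road cells and 0 elsewhere. Distances are measured
    between cell centres and scaled by pixel_size (hypothetical units).
    """
    roads = [(r, c) for r, row in enumerate(road_raster)
             for c, v in enumerate(row) if v == 1]
    if not roads:
        raise ValueError("road raster contains no road cells")
    return [[min(math.hypot(r - rr, c - rc) for rr, rc in roads) * pixel_size
             for c in range(len(row))]
            for r, row in enumerate(road_raster)]


# Toy raster with a single horizontal road in the second row.
roads = [[0, 0, 0, 0],
         [1, 1, 1, 1],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
distance = proximity_map(roads, pixel_size=15.0)
```

Cells on the road get a distance of zero, and the value grows with each row away from it, which is the signal the classifier can use as a proxy for human presence.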

2.3 Training and Validation Data

Reference data for training and validation were collected through field surveys as well as visual interpretation using Google Earth. Field surveys were carried out in February 2015 and February 2016 over Northern Vietnam. We used high-accuracy GPS-integrated Casio H20G cameras to collect photographs of land cover objects including barren, water, forest, grasslands, orchard, other crop, rice and urban (Fig. 2). The homogeneous area of each object had to be more than 15 m in diameter. The total length of the field route was about 2000 km, and a total of 1801 GPS photos were collected (Fig. 1b). All required information about the field observation was then extracted from each GPS photo, including geolocation, land cover category, observation time, degree of homogeneity of the targeted object and the photo itself. This process was carried out using a web-based reference dataset named “Site-based dataset for Assessment of Changing Land cover by JAXA (SACLAJ)”.

Fig. 2. Photos of land cover types from the field

The regional scale of the study area requires a larger number and denser spatial distribution of reference data. Therefore, we conducted image-based visual interpretation using Google Earth in combination with ASTER and Landsat-5 images for 2007 and ASTER and Landsat-8 images for 2015. An additional 60,000 reference samples were created in this way. We used 66% of the total of 62,000 samples for training and the remaining 34% for validation.
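The 66%/34% partition can be sketched as follows. The text does not say whether the split was random, stratified by class or spatially constrained, so the simple seeded random shuffle below is an assumption, and the function name and sample identifiers are hypothetical.

```python
import random


def split_samples(samples, train_fraction=0.66, seed=0):
    """Shuffle samples reproducibly and split them into training and validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical integer identifiers standing in for the 62,000 reference samples.
sample_ids = range(62000)
train, validation = split_samples(sample_ids)
```

With 62,000 samples this yields 40,920 training samples and 21,080 validation samples, with no sample appearing in both sets.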

2.4 Classification Method

This study employed a kernel-based probabilistic classification based on Bayesian inference. We adopted this method from the work of Hashimoto et al., where the mathematical details are described [29,30,31]. Hashimoto et al. (2013) reported that kernel-based probabilistic classification showed the highest overall accuracy in comparison with maximum likelihood classification and support vector machines when applied to land cover classification over Japan [30]. The core of the method is a probabilistic model that simulates the distribution of observable data given some hidden parameters, that is, a generative model. Specifically, the method builds a generative model from the training data using kernel density estimation in order to calculate a probability for each feature vector. The joint probability is then calculated by multiplying the multi-temporal and multi-sensor posterior probabilities of each land cover class. The joint probability distribution is then normalized so that the joint probabilities of all classes sum to 1. Finally, the land cover class with the highest joint probability is taken as the classification result.
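The sequence of steps described above (kernel density estimation per class, per-observation posterior, product over observations, normalization, maximum) can be sketched as follows. This is a simplified illustration, not the Saclass implementation or the exact formulation of [29,30,31]: it uses one-dimensional Gaussian kernels, fixed bandwidths and equal class priors, and all names and training values are hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate built from training samples."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

    return density


def posterior(densities, x):
    """Per-class posterior for one observation, assuming equal class priors."""
    likelihood = {cls: d(x) for cls, d in densities.items()}
    total = sum(likelihood.values())  # assumed non-zero for this sketch
    return {cls: v / total for cls, v in likelihood.items()}


def classify(observations, models):
    """Multiply the posteriors of all observations, normalize, pick the best class."""
    classes = list(next(iter(models.values())))
    joint = {cls: 1.0 for cls in classes}
    for source, x in observations:
        post = posterior(models[source], x)
        for cls in classes:
            joint[cls] *= post[cls]
    total = sum(joint.values())
    joint = {cls: p / total for cls, p in joint.items()}
    return max(joint, key=joint.get), joint


# Hypothetical training values for two data sources and two classes.
training = {
    "optical_nir": {"water": [0.02, 0.03, 0.05], "forest": [0.30, 0.35, 0.40]},
    "sar_hv_db": {"water": [-30.0, -28.0, -27.0], "forest": [-14.0, -13.0, -12.0]},
}
bandwidths = {"optical_nir": 0.05, "sar_hv_db": 2.0}
models = {src: {cls: gaussian_kde(vals, bandwidths[src]) for cls, vals in by_class.items()}
          for src, by_class in training.items()}
label, joint = classify([("optical_nir", 0.33), ("sar_hv_db", -13.5)], models)
```

Multiplying posteriors lets each scene or sensor contribute independently, so a pixel that is ambiguous in one source can still be resolved by another.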

The multi-temporal aspect was handled by two time variables, given in Eq. (1). Time information was extracted from the imagery acquisition date (DOY: day of year) and expressed as coordinates on the trigonometric circle. The two time variables were then added to the feature vectors, so that each feature vector comprised both spectral and temporal information. The kernel-based probabilistic classification was then conducted in this feature space.

$$ [t_{1}, t_{2}] = \left[ \cos\left( 2\pi \frac{DOY}{DOY_{max}} \right), \sin\left( 2\pi \frac{DOY}{DOY_{max}} \right) \right] \tag{1} $$
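Equation (1) can be evaluated directly from an acquisition date, as in the short sketch below. The text does not define DOY_max, so taking it as the number of days in the acquisition year (365 or 366) is an assumption, and the function name is hypothetical.

```python
import math
from datetime import date


def time_variables(acquisition_date):
    """Return (t1, t2) of Eq. (1) for a given acquisition date."""
    doy = acquisition_date.timetuple().tm_yday
    # Assumption: DOY_max is the number of days in the acquisition year.
    doy_max = date(acquisition_date.year, 12, 31).timetuple().tm_yday
    angle = 2.0 * math.pi * doy / doy_max
    return math.cos(angle), math.sin(angle)


start_of_year = time_variables(date(2015, 1, 1))
end_of_year = time_variables(date(2015, 12, 31))
```

Mapping the date onto a circle keeps 31 December and 1 January close together in feature space, which a raw day-of-year value would not do.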

Imagery data were managed in 14 square tiles in order to avoid processing large data volumes at once. Each tile measures 1° × 1° (Fig. 3). The classification was conducted separately for each tile and for each dataset, including Landsat, ASTER, SAR and ancillary data. Training data were prepared for each tile. To avoid mismatches at tile edges, the spatial extent of the training data for each tile was expanded 0.5° beyond the tile boundary. The classification was run with the Saclass software version 1.2, developed by the University of Tsukuba and JAXA, on a computer equipped for General-Purpose computing on Graphics Processing Units (GPGPU).
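The selection of training data for a tile with a 0.5° margin can be sketched as a simple bounding-box filter. The function name, the sample tuples and the tile origin below are hypothetical and serve only to illustrate the buffering idea.

```python
def training_for_tile(samples, lon_min, lat_min, tile_size=1.0, buffer_deg=0.5):
    """Select samples that fall inside a tile expanded by buffer_deg on every side.

    Each sample is a (longitude, latitude, label) tuple in decimal degrees.
    """
    west, east = lon_min - buffer_deg, lon_min + tile_size + buffer_deg
    south, north = lat_min - buffer_deg, lat_min + tile_size + buffer_deg
    return [s for s in samples if west <= s[0] <= east and south <= s[1] <= north]


# Hypothetical reference samples around a tile whose south-west corner is 105E, 21N.
samples = [(105.2, 21.3, "rice"), (104.7, 21.9, "forest"), (106.8, 21.5, "urban")]
selected = training_for_tile(samples, lon_min=105.0, lat_min=21.0)
```

The second sample lies outside the tile itself but inside the margin, so it still informs the classifier near the tile edge, while the third sample is too far away to be used.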

Fig. 3. Schematic diagram of the classification process

We defined nine land cover types (Table 2) based on geographical knowledge of the study area. The region includes sizeable orchard areas on intensively cultivated land, located mainly in the hill lands north of the Red River Delta and in the plain. Forest is dominated by evergreen broadleaf forest, whereas bamboo, deciduous and needleleaf vegetation occupy small percentages. Therefore, we used a single forest category for all forest types. To support conversion between land cover category systems for comparative studies or further applications, we used the LCCS classification system recommended by FAO [32] to set up our category system with standard definitions and standard hierarchical classification schemes (Table 2).

Table 2. Land cover categories description in this study

3 Results and Discussion

Figure 4 shows the resulting land cover map in 2015 (Fig. 4a) and the resulting land cover map in 2007 (Fig. 4b). The land cover categories are described in Table 2. The overall accuracies of the land cover maps in 2015 and 2007 are 89.0% and 81.3%, respectively, and the kappa coefficients are 0.8723 and 0.7815, respectively (Table 3). The classes with the lowest accuracies in both land cover maps are grasslands and other crops. For the 2015 map, grasslands showed a user's accuracy of 73.6% and a producer's accuracy of 81.3%, while the corresponding figures for other crops were 76.5% and 82.9%. The other land cover type with low accuracy is barren. The land cover types with the highest accuracies are water, urban and built-up, and forest, with accuracies ranging from 88% to 97% in the 2015 map. The uncertainty in grasslands and other crops may stem from the similarity of their spectral characteristics, while barren is likely to be confused with dry terrace paddy, grasslands and urban areas.
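For readers less familiar with these measures, the sketch below shows how overall accuracy, the kappa coefficient, user's accuracy and producer's accuracy are derived from a confusion matrix. The matrix is a toy example rather than Table 3, and the convention that rows are mapped classes and columns are reference classes is an assumption.

```python
def accuracy_metrics(matrix):
    """Compute accuracy measures from a square confusion matrix.

    Assumed convention: matrix[i][j] is the number of validation samples
    mapped as class i whose reference class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(n))
    row_sums = [sum(matrix[i]) for i in range(n)]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    overall = diagonal / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (overall - expected) / (1.0 - expected)
    users = [matrix[i][i] / row_sums[i] for i in range(n)]
    producers = [matrix[j][j] / col_sums[j] for j in range(n)]
    return overall, kappa, users, producers


# Toy two-class confusion matrix (not the data of Table 3).
toy = [[45, 5],
       [10, 40]]
overall, kappa, users, producers = accuracy_metrics(toy)
```

User's accuracy is read along the mapped class (how often a mapped label is right), while producer's accuracy is read along the reference class (how often a true class is found), which is why the two can differ for the same class.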

Fig. 4. (a) Land cover map in 2007 and (b) land cover map in 2015. A, B, C are the selected sites for analyzing land cover change

Table 3. Confusion matrix of the land cover map in 2015 (a) and the land cover map in 2007 (b)

During the last decade, research studies, government information and the media have documented significant changes in land use/land cover in Northern Vietnam due to forest plantation [24, 25], changes in land use purpose [33, 34], construction of large hydropower dams [26] and urbanization. In this paper, we focus on the most prominent changes, namely forest gain and land cover changes in the reservoir area of the Son La hydropower plant before and after the plant began operation in 2010. To analyze these changes, we selected focus areas where they occurred: site A is the Son La hydropower reservoir area, and sites B and C are places reported to have experienced forest gain [25] (Fig. 4). Figure 5 shows magnified views of the selected sites in 2007 and 2015.

Fig. 5. The selected sites for analyzing land cover change between 2007 and 2015. The solid gray line shows the administrative boundaries of communes.

To analyze the changes, we used the area ratio to compare the dynamics of each land cover type at commune level. The administrative boundaries of communes are shown as solid gray lines in Fig. 5. The area ratio was calculated by dividing the area of each land cover type in a commune by the total area of the commune.
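The area ratio can be computed from two aligned rasters, one holding the land cover class of each pixel and one holding the commune it belongs to. The sketch below assumes that all pixels within a commune have equal area, which is a simplification for data stored in geographic coordinates; the function name and toy rasters are hypothetical.

```python
from collections import Counter


def area_ratios(class_map, commune_map):
    """Return {commune: {land cover class: percentage of the commune's pixels}}.

    Assumption: every pixel covers the same area, so pixel counts stand in for area.
    """
    counts = {}
    for class_row, commune_row in zip(class_map, commune_map):
        for cls, commune in zip(class_row, commune_row):
            counts.setdefault(commune, Counter())[cls] += 1
    return {commune: {cls: 100.0 * n / sum(c.values()) for cls, n in c.items()}
            for commune, c in counts.items()}


# Toy rasters: two communes (A and B) of four pixels each.
class_map = [["water", "water", "forest", "forest"],
             ["crop", "water", "forest", "crop"]]
commune_map = [["A", "A", "B", "B"],
               ["A", "A", "B", "B"]]
ratios = area_ratios(class_map, commune_map)
```

Running this once per year gives, for each commune, one pair of percentages per land cover type that can be compared between 2007 and 2015.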

Fig. 6. The comparison of area ratio of the land cover categories for each commune in 2015 and 2007. In each single graph, the Y axis shows the ratio (in percentage) of a land cover type’s area in a commune to the total area of the commune in 2015; the X axis shows the ratio in 2007. The solid gray line shows the 1:1 relationship. The solid black line shows the linear regression line.

Figures 6a to 6g show the area ratio comparison for water, urban and built-up, rice, other crops, grasslands, barren and forest, respectively, in site A. In the water category, the area ratio showed a strongly increasing trend (Fig. 6a), whereas in the other crops category it showed a clearly decreasing trend (Fig. 6d). This change is also visible in Fig. 5a and b, where cropland in 2007 was replaced by water in 2015 over the Son La hydropower reservoir area. In the mountainous areas of Northern Vietnam, people tend to live near rivers and use the narrow plains along the rivers for agriculture. Therefore, cropland around the river was inundated by water in the reservoir area [35]. The area ratio in the urban and rice categories showed virtually no change between 2007 and 2015 (Fig. 6b and c). In the grasslands category, the area ratio showed a slight increase between the two years (Fig. 6e). This may be due to cropland being abandoned following the evacuation of a large number of people from the reservoir area [26]. The area ratio in the barren and forest categories showed a mixed pattern (Fig. 6f and g).

Forest gain between 2007 and 2015 in site B (Fig. 5c and d) and in site C (Fig. 5e and f) is visible as the replacement of grasslands and barren land in 2007 with forest in 2015. Along with the increasing area ratio in the forest category (Fig. 6o and w), the area ratios in the grasslands category (Fig. 6l and t) and in the barren category (Fig. 6n and v) showed a decreasing trend between the two years.

In site B, the area ratio in the other land cover categories, including water (Fig. 6h), urban (Fig. 6i) and rice (Fig. 6j), showed no significant change. Almost all the data points in the plots were close to the 1:1 line. In site C, the area ratio in the water category (Fig. 6p) and the urban category (Fig. 6q) showed no significant change. In the rice category of site C, however, the area ratio showed a slightly decreasing trend, with R² = 0.9 between the two years (Fig. 6r). As site C is located in a suburban area, the decline in rice paddy area may be caused by urbanization, in which paddy areas are commonly converted to urban and built-up land or infrastructure [36].
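The comparison against the 1:1 line can be reproduced with an ordinary least-squares fit of the 2015 ratios on the 2007 ratios. The sketch below uses made-up commune values, not the data of Fig. 6, and the function name is hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, with R squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Made-up area ratios (%) of one land cover type in four communes.
ratio_2007 = [10.0, 20.0, 30.0, 40.0]
ratio_2015 = [8.0, 16.0, 24.0, 32.0]
slope, intercept, r_squared = linear_fit(ratio_2007, ratio_2015)
```

A regression line lying below the 1:1 line (slope below 1 with an intercept near zero, as in this toy case) indicates that the category lost area between the two years, while a line above it indicates a gain.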

4 Conclusion

The land cover maps of Northern Vietnam for 2007 and 2015, produced by kernel-based probabilistic classification using multi-temporal and multi-sensor data, achieved accuracies of 81% and 89%, respectively. These results provide high-resolution regional land cover data for Northern Vietnam for use in environmental modeling or other regional studies. Moreover, in comparison with other land cover datasets such as MODIS, CCI or FROM-GLC, the innovation of our land cover data is that rice paddy, a very important land cover type in Southeast Asia, was separated from cropland. The most prominent land use/land cover dynamics between the two years, namely forest gain and the land cover changes induced by the Son La reservoir, were well captured by the results. The analysis showed a large increase in water area and a large decrease in cropland area in the Son La hydropower dam area between 2007 and 2015. In the sites selected for forest gain detection, the analysis also revealed a clear increase in forest area and a noticeable decrease in grassland area.

Northern Vietnam is expected to undergo large changes in land use/land cover in the near future because of (1) human activities, including urbanization, resource exploitation and changes in land use purpose, and (2) vulnerability to environmental changes, including climate change, sea level rise and disasters. It is therefore important to continue monitoring changes in land cover using satellite data and to use land cover data in environmental modeling. This in turn calls for improving land cover data products in terms of accuracy, resolution and classification schemes.

For further applications or other interests, readers can download the land cover maps produced in this study from the JAXA/EORC website:

http://www.eorc.jaxa.jp/ALOS/en/lulc/lulc_vnm.htm