Introduction

A digital surface model (DSM) is a representation of the shape of the Earth’s surface. Several near-global DSMs have been produced from satellite-borne platforms from either radar, e.g. SRTM (Farr et al. 2007) or stereoscopic optical imagery, e.g. ASTER (Meyer et al. 2011). We deliberately distinguish between a DSM and a digital elevation model (DEM), also sometimes known as a digital terrain model (DTM), where a DEM/DTM represents the solid topographic surface, whereas a DSM represents the surface sensed, which includes the height of vegetation canopy and man-made structures (cf. Hirt 2014). A satellite-derived DSM should be treated for speckle noise (Gallant 2011) and stripe noise (Tarekegn and Sayama 2013), and can then be converted to a DEM by accounting for absolute biases (Crippen et al. 2016) and tree height biases (O’Loughlin et al. 2016). Yamazaki et al. (2017) have treated the SRTM v2.1 DSM for all four of these error sources to produce the MERIT3″ DEM. DEMs and DSMs should also be checked for other artefacts such as spikes, pits and line defects (e.g. Hirt 2018).

DEMs and DSMs are used synonymously in several applications such as mapping soil and vegetation (e.g. Dobos and Hengl 2009; Cavazzi et al. 2013), studying natural hazards (e.g. Gruber et al. 2009; Demirkesen 2012), catchment geomorphology and hydrology (e.g. Barnes et al. 2014; Zhao et al. 2019), watershed modelling (e.g. Park et al. 2011; Li et al. 2019), floodplain mapping (e.g. Jafarzadegan and Merwade 2017; Nardi et al. 2019), weather and flood forecasting (e.g. Truhetz 2010) and gravity field forward modelling (e.g. Banerjee and Gupta 1977; Forsberg 1984). The exemplar citations made above are not exhaustive because the literature on applications is so vast. However, researchers have started analysing the effect of using a DSM instead of the “required” DEM for their respective applications, as done by Yang et al. (2019) for gravity forward modelling. In this paper, we have used the terms DEM or DSM separately in many instances so as to reinforce the difference between the two.

Since the procedures for generating DSMs vary due to the different types of datasets or sensors involved (Gesch 2012), one should not generally rely on freely available DSMs without appreciating the accuracy/precision required for the application at hand. Rodriguez et al. (2005) and Farr et al. (2007) provide global accuracy analyses of the SRTM DSMs. Meyer et al. (2011) conduct a global accuracy assessment for ASTER. DEM/DSM assessments have also been made on regional scales (e.g. Nikolakopoulos et al. 2006; Racoviteanu et al. 2007; Hayakawa et al. 2008; Chirico et al. 2012; Gesch et al. 2012; Suwandana et al. 2012; Li et al. 2013; Jing et al. 2014; Purinton and Bookhagen 2017; Elkhrachy 2018; Zhang et al. 2019; Hawker et al. 2019) and countrywide scales (e.g. Hilton et al. 2003; Denker 2005; Hirt et al. 2010; Athmania and Achour 2014; Gesch et al. 2014; Ioannidis et al. 2014; Rexer and Hirt 2014; Varga and Bašić 2015). We attempt to add to this body of literature by providing results from the whole country of India, where the topographic morphology is quite diverse: heights range from –2 m to +8586 m and terrain gradients sometimes exceed 45° (in 2.4% of the 3,748,582,709 cells at 1″ × 1″ resolution). While studies have been conducted on the comparison and validation of different DEMs/DSMs in smaller regions of India (see Table 1), none are countrywide as we attempt in this investigation.

Table 1 Previous DEM/DSM assessment studies in India

India hosts part of the Himalaya Mountain Ranges in the north, the Gangetic Plain in the centre, the Aravalli and Vindhya Mountain ranges, the Western and Eastern Ghats, the Deccan Plateau, the Thar desert and a long peninsular coastline (Fig. 1). Thus, accuracy/precision assessment of DEMs/DSMs for India is of utility, especially when researchers are already using freely available DSMs for applications in India such as geology and geomorphometric analysis (e.g. Selvan et al. 2011; Gayen et al. 2013), watershed delineation (e.g. Sreedevi et al. 2009; Ahmed et al. 2010; Gopinath et al. 2014), identifying potential water harvesting sites near rivers (e.g. Ramakrishnan et al. 2009), assessment of tsunami risk (e.g. Kumar et al. 2007), hydrographic modelling (e.g. Patro et al. 2009) and estimating glacial mass balance (e.g. Berthier et al. 2007).

Fig. 1 Physical features of Indian topography. (Source: https://www.nationsonline.org/oneworld/map/India-Administrative-map.htm)

Unlike some of the previous studies in India (Table 1), and indeed elsewhere, we have deliberately preserved the respective meanings of DEM versus DSM throughout our analyses. Strictly, DEMs and DSMs should never be compared until one is transformed to the other (Yamazaki et al. 2017). In the study presented here, four freely available DSMs for India (SRTM1″, SRTM3″, ASTER1″ and Cartodem1″ [an India-only model; see below]) are inter-compared on a model-to-model basis. They are also “validated” with independent ground-truth height data provided by the Survey of India (SoI) to which National Aeronautics and Space Administration (NASA) canopy height information (Simard et al. 2011) has been added to give point DSM heights (Sect. 4). Along with these four DSMs, the MERIT3″ DEM is also validated with the same ground-truth data, without canopy heights applied. MERIT3″ was not included in the model-to-model DSM comparison. In India only, the national Cartodem DSM, derived from the Cartosat mission using stereoscopic optical imagery (NRSA 2006), is also used in regional applications (Bera et al. 2014; Das et al. 2015, 2018; Kumar and Gupta 2016), so we include this DSM in our assessments. The DSMs and DEMs evaluated are summarised in Table 2.

Table 2 DEMs used in the study

Due to the land height range in India (–2 m to +8586 m), our analysis is divided into three sub-parts based on classification of the heights into three intervals, with an implicit assumption that these may correlate with the broader morphology, namely H ≤ 500 m, 500 m < H ≤ 1500 m and H > 1500 m (Fig. 2b). The rationale behind the three chosen intervals is as follows: the Gangetic plains, the Thar desert and the peninsular coastline are all below 500 m; the whole of the Aravalli range (except a few peaks), the Vindhya range, the majority of the Eastern Ghats and half of the Western Ghats are between 500 and 1500 m; and the other half of the Western Ghats, a small extent of the Eastern Ghats and almost the whole of the Himalayan belt are above 1500 m. The claimed accuracies/precisions for all the DEMs/DSMs (Table 2) are also cross-checked on whole-of-India and height-range-wise bases. This is of utility because the accuracy statistics defined from global assessments may not be applicable to India, which certainly appears to be the case for high-elevation areas.

Fig. 2 Terrain of India (a) and the three height ranges tested (b) (equi-rectangular projection)

Subtleties of Indian Height Data

The nominal vertical datum of the Cartodem DSM is WGS84 and it thus provides ellipsoidal heights of the Earth’s surface. To achieve a consistent vertical datum among the DSMs (cf. Table 2), the Cartodem was also referenced to EGM96 (Lemoine et al. 1998) by subtracting EGM96 geoid undulation values and rounding to the nearest metre as was done when computing SRTM physical heights (cf. Farr et al. 2007, p. 19). EGM96 is an older spherical harmonic degree-360 geopotential model, and comparatively better high-degree geopotential models are now available, such as EGM2008 to degree 2190 (Pavlis et al. 2012, 2013). To show the effect of using EGM2008 instead of EGM96, a difference map was prepared and truncated to the nearest metre. Figure 3 shows that DEMs/DSMs derived from each geoid model can differ by up to 12 m in magnitude, particularly in the Indian Himalaya (cf. Fig. 1). The effect of the different geoid models will be assessed later in Sect. 3.3.
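
The datum conversion described above amounts to H = h − N, where h is the ellipsoidal height, N the geoid undulation and H the physical height. A minimal Python sketch follows; the function name, the sample values and the half-up rounding rule are illustrative assumptions, not the processing chain actually used for Cartodem or SRTM.

```python
import math


def ellipsoidal_to_physical(h_ellipsoidal, geoid_undulation, round_to_metre=True):
    """Return physical heights H = h - N (metres) for paired sequences.

    h_ellipsoidal    : ellipsoidal heights, e.g. from a WGS84-referenced DSM
    geoid_undulation : geoid undulations N at the same locations
    round_to_metre   : if True, round H to the nearest metre (half-up rule,
                       an assumption made for this sketch)
    """
    heights = [h - n for h, n in zip(h_ellipsoidal, geoid_undulation)]
    if round_to_metre:
        heights = [math.floor(x + 0.5) for x in heights]
    return heights


# Hypothetical values in metres, for illustration only.
h = [212.4, 5120.7]
n_geoid = [-61.2, -30.9]
print(ellipsoidal_to_physical(h, n_geoid))
```

Swapping one geoid model for another changes H by the difference between the two undulation values at each cell, which is the quantity mapped in Fig. 3.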

Fig. 3 Geoid differences between EGM2008 and EGM96, truncated to the nearest metre (equi-rectangular projection)

As well as model-to-model comparisons, the DEMs/DSMs are “validated” with independent ground-truth data, comprising 3842 differentially levelled benchmarks and 145 ground control points (GCPs).

  • The 3842 benchmarks (Fig. 4) consist of latitude, longitude and levelled heights above local mean sea level (MSL). They come from the database archived by the Bureau Gravimetrique International (BGI) and were originally sourced from the SoI and the Indian National Geophysical Research Institute (NGRI). Though the horizontal and vertical precisions are not known, all the relevant infrastructure and research projects in India are based on benchmarks established by SoI. These are the heights that we have used in our analysis. Vertical precisions are important to be confident that we are not validating the DEM/DSM heights with erroneous ground control. Horizontal precision is important to be confident that we are not interpolating the DEM/DSM height to the wrong location, which can be a substantial problem in areas of steep terrain gradients.

  • The 145 GCPs consist of GNSS-determined latitude, longitude and ellipsoidal height. Geoid undulation values from EGM96 were subtracted from these ellipsoidal heights to determine physical heights that are compatible with the DEMs/DSMs (cf. Table 2), but not rounded to the nearest metre. The GCPs are concentrated in five different regions of the country: Hyderabad, Bangalore, Kanpur, Dehradun and Saharanpur (Fig. 5). The GCPs in Kanpur were observed using dual frequency GNSS, while GCPs at other locations were obtained from the SoI archive. The horizontal and vertical precision of these data lies within 12 to 26 mm and 31 to 53 mm, respectively (Mishra 2017).

Fig. 4 Spatial distribution of the 3842 levelled benchmarks (equi-rectangular projection)

Fig. 5 Spatial distribution of the 145 GCPs (source: Google Earth)

We return to the caveat in the first paragraph of Introduction, qualifying that a DEM is distinctly different from a DSM. The benchmarks and GCPs give the physical (MSL-based) heights of the solid ground, so are compatible with DEMs, but not with DSMs. Therefore, in the later analysis (Sect. 4), canopy height (CH) information is added to the ground-truth data for comparison with DSMs in order to achieve compatibility. We have not conducted an analysis of the veracity of the CH model over India, instead taking the published values “at face value”. We also acknowledge that other corrections are needed, as outlined in Introduction.

Inter-Comparison Among DSMs

The SRTM v4.1 DSM was first bicubically interpolated from 3″ × 3″ to 1″ × 1″ resolution to make it spatially consistent with the other three DSMs (SRTM v3.0, ASTER GDEM2 and Cartodem; Table 2). The DSMs were compared according to three criteria:

  1. For the whole country of India, producing a total of 3,748,582,709 1″ × 1″ DSM differences

  2. For DSM heights divided into three ranges, namely H ≤ 500 m, 500 m < H ≤ 1500 m and H > 1500 m (Fig. 2b)

  3. For four intervals that are defined according to the claimed accuracies/precisions of the DSMs (Table 6 later).

Finally, we replace EGM96 with EGM2008 for all the DSMs to gauge the effect of using a higher-degree geoid model to obtain physical heights from a DSM.
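
The height-range-wise comparison can be sketched as grouping cell-by-cell differences by height band and summarising each group. The Python below is only an illustration: the function names and sample values are hypothetical, and which heights are used to assign a cell to a band is an assumption, since the text does not specify it.

```python
import statistics

BAND_LABELS = ("H <= 500 m", "500 m < H <= 1500 m", "H > 1500 m")


def height_band(height_m):
    """Return the label of the height range that contains height_m."""
    if height_m <= 500.0:
        return BAND_LABELS[0]
    if height_m <= 1500.0:
        return BAND_LABELS[1]
    return BAND_LABELS[2]


def banded_difference_stats(band_heights, model_a, model_b):
    """Summarise the differences A - B per height band.

    band_heights decides the band of each cell (assumed here to be supplied
    by the caller); model_a and model_b are co-located heights in metres.
    """
    groups = {label: [] for label in BAND_LABELS}
    for h, a, b in zip(band_heights, model_a, model_b):
        groups[height_band(h)].append(a - b)

    summary = {}
    for label, diffs in groups.items():
        if diffs:  # skip bands that contain no cells
            summary[label] = {
                "n": len(diffs),
                "min": min(diffs),
                "max": max(diffs),
                "mean": statistics.fmean(diffs),
                "std": statistics.pstdev(diffs),  # population STD
            }
    return summary


# Hypothetical co-located cell heights in metres.
stats = banded_difference_stats(
    [100.0, 400.0, 800.0, 2000.0],
    [101.0, 402.0, 805.0, 2010.0],
    [100.0, 400.0, 800.0, 2000.0],
)
for label, row in stats.items():
    print(label, row)
```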

Nationwide Inter-Comparison

Possibly the most alarming observation from Table 3 is that the DSMs can differ by several kilometres, though the percentage of such pixels is proportionally small (Table 4). These large height differences among the DSMs are most probably due to geolocation errors (Rodriguez et al. 2005), i.e. horizontal shifts among the DSMs caused by incorrect co-registration (Denker 2005). These shifts result in comparing DEM/DSM cells at two different locations, hence producing substantial height differences, especially in areas of steep terrain gradients. Also, from Table 4, the numbers of pixels in the different ranges for S1-AS individually, and for S1-AS and AS-CA collectively, show that SRTM1″ and ASTER are more consistent with one another than the other model pairs. [The abbreviations for the DSM names are given in the first row of Table 2.] This consistency is also supported by the fact that only 0.1% of the difference pixels for S1-AS lie beyond the range [−100 m, 100 m]. Also, on analysing the three pairs, i.e. S1-AS, S3-CA and AS-CA, it is observed that the Cartodem is more congruent with SRTM1″ and ASTER than with SRTM3″. This is probably only because SRTM3″ was bicubically resampled to a 1″ × 1″ spatial resolution. The total number of pixels in each 1″ × 1″ DSM is 3,748,582,709, and ∆H represents the difference among various pairs of DSMs (e.g. S1-S3, S1-AS, S1-CA, S3-AS, S3-CA and AS-CA).
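
To see why a horizontal shift matters more in steep terrain, consider a locally planar slope: a shift of Δx metres along the direction of steepest descent changes the sampled height by roughly Δx · tan(slope). The Python sketch below encodes only this simplified model; the function name and the shift of about 30 m (roughly one 1″ cell) are assumptions made for illustration.

```python
import math


def height_error_from_shift(shift_m, slope_deg):
    """Approximate height difference (metres) caused by a horizontal shift
    of shift_m metres along the steepest direction of a planar slope."""
    return shift_m * math.tan(math.radians(slope_deg))


# Assumed shift of about one 1-arc-second cell (~30 m) for several gradients.
for slope in (1.0, 10.0, 30.0, 45.0):
    print(f"slope {slope:4.1f} deg -> {height_error_from_shift(30.0, slope):6.2f} m")
```

Under this model, the same shift that is almost harmless on a plain produces a height difference comparable to the cell size on a 45° slope; larger shifts scale the effect proportionally.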

Table 3 Statistics of inter-comparison among DSMs. Units in metres. The abbreviations for the DSM names are given in the first row of Table 2
Table 4 Distribution of “large” differences among the DSMs over India. The abbreviations for the DSM names are given in the first row of Table 2

Figure 6 shows the striping effects among the DSMs. Striping in ASTER was also observed by Hirt et al. (2010) over Australia. Considering the fact that SRTM has stripe effects with a different pattern compared to ASTER (cf. Gallant and Read 2009), and on comparing (i) Fig. 6b, c and (ii) Fig. 6d, e, it can be claimed that Cartodem also has stripe effects that are nearly in the same direction as those in ASTER (Fig. 6c, e). Stripes are also shown in Fig. 6f (AS–CA), indicating the non-negligible difference in the magnitude of the stripes in ASTER and Cartodem. Hirt et al. (2010) pointed out that the stripe effects in ASTER occur on scales of several thousand kilometres; Fig. 6 shows the similarity of this phenomenon for Cartodem in India.

Fig. 6 Visual representation of the differences between DSM pairs over India, showing stripes. [a S1-S3; b S1-AS; c S1-CA; d S3-AS; e S3-CA; f AS-CA]. The abbreviations for the DSM names are given in the first row of Table 2

Height-Range-Wise Inter-Comparison

Table 5 shows that, despite the lowest standard deviations (STDs) of ∆H for the height range H ≤ 500 m, large differences exist among DSMs (cf. Table 4). The significant differences between S1-S3 (both derived from the same satellite mission) are possibly due to systematic errors between the two DSMs, primarily found in the mountainous regions. This is possibly because SRTM1″, a high-resolution DSM, provides a better topographic representation compared to SRTM3″, especially along ridges and valleys. Other discrepancies between Cartodem and the other DSMs are also observed at the locations of large lakes and active open-pit mine sites (Fig. 7). This is due to the different epochs of the observations and re/processing involved in the development of each DSM, which is [partly] reflected by the release dates in Table 2. A similar observation has been reported by Long et al. (2020) over open-pit mines in Quang Ninh Province in Vietnam.

Table 5 Statistics of the DSM inter-comparison based on a range-wise classification. Units in metres. The abbreviations for the DSM names are given in the first row of Table 2
Fig. 7 Differences between SRTM1″ and Cartodem (greyscale panels) at the locations of large lakes and active open-pit mine sites (background images from Google Earth)

Inter-Comparison According to DSM Claimed Precision

We deduce four accuracy/precision intervals according to the claimed accuracies/precisions of the DSMs (Table 6). The percentages of points lying in these different intervals are shown in Table 7.

Table 6 Accuracy/precision intervals as deduced from other investigations
Table 7 Percentage of pixels (from model-to-model comparison) lying in the intervals set in Table 6

From Table 7, the percentages of pixels in intervals In1 and In2 for S1–AS, S3–AS and AS–CA show that ASTER contains more error compared to the other three DSMs. The claimed accuracies/precisions are only valid if 90% of the data satisfy the given accuracy requirements (cf. Rodriguez et al. 2005). In the lowland range (< 500 m), more than 90% of the differences for S1–S3, S1–CA and S3–CA lie in the interval In2. This indicates that the three DSMs (i.e. S1, S3, CA) are congruous with their claimed accuracies, but only in this height range. It is found that 90% of the total S1–CA difference pixels (without any height-banded classification) fall within ± 8 m, which resembles the 90% observations of Muralikrishnan et al. (2013) and Bothale and Pandey (2013).
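
The 90% criterion can be illustrated with a short Python sketch that reports the share of differences inside a symmetric tolerance and the tolerance that contains 90% of them. The function names and sample differences are hypothetical; the ± 8 m tolerance is taken from the text only as an example, and the interval limits of Table 6 are deliberately not reproduced.

```python
def percent_within(diffs, tolerance_m):
    """Percentage of differences whose magnitude is <= tolerance_m."""
    inside = sum(1 for d in diffs if abs(d) <= tolerance_m)
    return 100.0 * inside / len(diffs)


def meets_claimed_accuracy(diffs, tolerance_m, required_percent=90.0):
    """True if at least required_percent of differences lie within tolerance."""
    return percent_within(diffs, tolerance_m) >= required_percent


def le90(diffs):
    """Smallest absolute difference that bounds at least 90% of the sample."""
    ordered = sorted(abs(d) for d in diffs)
    k = (9 * len(ordered) + 9) // 10  # integer ceiling of 0.9 * n
    return ordered[k - 1]


# Hypothetical model-to-model differences in metres.
sample = [-1, 2, -3, 4, -5, 6, -7, 8, -9, 50]
print(percent_within(sample, 8.0))
print(meets_claimed_accuracy(sample, 8.0))
print(le90(sample))
```

A single gross outlier, such as the 50 m value in the sample, barely affects these percentile-style measures, which is one reason they complement the minimum, maximum and STD.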

Finally, the overall statistics and the percentage of pixels in different accuracy/precision intervals after replacing EGM96 by EGM2008 for all the DSMs are summarised in Table 8. This shows no significant change either in the overall statistics (cf. Table 3) or the distribution of differences (cf. Table 7) after transforming the DSMs to physical heights using EGM2008. Therefore, it appears immaterial as to which geoid model is used to transform the geometric ellipsoidal heights to physical heights given the intrinsic accuracy/precision of the DSMs (cf. Fig. 3), but this only applies to India and might not be the case in countries with relatively lower topographical elevations.

Table 8 Statistics (cf. Table 3) and percentage of points in different intervals (cf. Table 6) after replacing EGM96 by EGM2008 geoid values. Units in metres

Validating DEMs with Ground-Truth Physical Heights

The DEMs are now “validated” with two sets of independent ground-truth data: 3842 levelled benchmarks and 145 GPS-based GCPs. Recalling from Sect. 2, the ellipsoidal heights of GCPs were converted to physical heights by subtracting the EGM96 geoid model. Since SRTM1″, SRTM3″, ASTER and Cartodem are all DSMs, canopy height (CH) data from NASA (Simard et al. 2011) were added to the ground-truth point heights. The CH data were not subtracted from the entire DSMs pixel by pixel because the conversion of a DSM to a DEM also involves extra filtering techniques as summarised in Introduction. Thus, just removing the CHs does not necessarily provide a true DEM, but we believe it to be better than using a DSM alone. We did not conduct an analysis of the veracity of the CH data, instead taking the NASA model at face value.
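
The way canopy height enters the comparison can be sketched as follows: for a DSM the reference is ground height plus CH, whereas for a DEM it is the ground height alone. The Python below is a hedged illustration; the function names, the sample values and the sign convention (model minus reference) are assumptions, not details given in the text.

```python
def reference_height(ground_height_m, canopy_height_m, model_is_dsm):
    """Ground-truth height made compatible with the model being assessed."""
    if model_is_dsm:
        return ground_height_m + canopy_height_m  # DSM senses the canopy top
    return ground_height_m  # DEM represents the solid ground


def model_minus_reference(model_height_m, ground_height_m, canopy_height_m, model_is_dsm):
    """Height difference at one ground-truth point (assumed sign: model - reference)."""
    return model_height_m - reference_height(ground_height_m, canopy_height_m, model_is_dsm)


# Hypothetical benchmark: 100 m ground height under a 15 m canopy.
print(model_minus_reference(118.0, 100.0, 15.0, model_is_dsm=True))
print(model_minus_reference(101.0, 100.0, 15.0, model_is_dsm=False))
```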

Figure 8 shows the distribution of the heights of the 3842 benchmarks and 145 GCPs. They reflect the difficult logistics of collecting surveying data at inaccessible altitudes. As such, this validation only really holds for elevations less than, say, ~ 500 m (cf. Fig. 8a). In addition, they only sample geographically limited parts of India. Table 9 shows the statistics of comparisons between the DSMs/DEMs and these two ground-truth datasets, where the CH has been added when assessing the DSMs. For the heights extracted from Cartodem, there are two points with unexpectedly large height differences (i.e. −191 m and −186 m). These points are not removed from the analyses because the overall statistics of the comparison do not change significantly after removing them (min = −336.7 m, max = 270.2 m, mean = 0.9 m, MAE = 8.4 m, STD = 19.2 m and RMSE = 19.2 m).

Fig. 8 Distribution of (a) the levelled benchmark heights: max 4057.2 m, min 1.5 m, and (b) the GCP heights: max 2002.7 m, min 124.8 m

Table 9 Statistics of comparison between ground-truth heights and the DEMs/DSMs. Units in metres. The abbreviations for the DSM names are given in the first row of Table 2

The statistics in Table 9, when viewed collectively and more so by the mean absolute error (MAE) and root mean square error (RMSE), indicate that MERIT3″ compares relatively closer with respect to the ground-truth heights as compared to the DSMs. This is most probably because other error sources (mentioned in Introduction) were removed in the construction of MERIT3″ (Yamazaki et al. 2017), whereas we have only applied the CHs to the ground-truth in this study. The better results for MERIT3″ with respect to the GCPs can also be attributed to GPS data generally being collected in open areas (away from buildings/trees) for satellite visibility. Therefore, there is less probability of CH error due to the presence of man-made features or vegetation (cf. Denker 2005; Hirt et al. 2010).
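
For reference, the descriptive statistics named above can be computed from a list of height differences as in the following Python sketch. The function name and sample values are hypothetical, and the use of the population (rather than sample) standard deviation is an assumption.

```python
import math


def validation_stats(diffs):
    """Return min, max, mean, MAE, STD and RMSE of height differences (metres)."""
    n = len(diffs)
    mean = sum(diffs) / n
    mae = sum(abs(d) for d in diffs) / n                     # mean absolute error
    rmse = math.sqrt(sum(d * d for d in diffs) / n)          # root mean square error
    std = math.sqrt(sum((d - mean) ** 2 for d in diffs) / n)  # population STD
    return {"min": min(diffs), "max": max(diffs), "mean": mean,
            "mae": mae, "std": std, "rmse": rmse}


# Hypothetical model-minus-reference differences in metres.
result = validation_stats([1.0, -1.0, 3.0, -3.0])
print(result)
```

With the population STD, RMSE² = mean² + STD², so the RMSE and STD nearly coincide whenever the mean difference is small relative to the spread.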

We next repeat the analyses conducted among the DSMs, but now with the ground-truth data, including the MERIT3″ DEM, and after CHs have been added to the ground-truth when DSMs are assessed. We restrict the presentation here to only the levelled benchmarks because of the larger sample size with broader spatial (Fig. 4) and vertical (Fig. 8) distributions versus the GCPs (cf. Figs. 5, 8). Our analyses with the GCPs do not contradict the findings presented below. The DEM/DSM comparisons with height-range-wise and accuracy/precision-wise classification are given in Tables 10 and 11, respectively.

Table 10 Statistics of the comparison with benchmarks based on range-wise classification. The model with the least MAE and RMSE values is the most preferred. Units in metres
Table 11 Percentage of points lying in different accuracy/precision intervals (cf. Table 6). The model with the highest percentage in intervals In1, In2, In3 and the lowest percentage in interval In4 is the most preferred

First, however, it is important to acknowledge that the number of benchmarks with MSL-based land elevations greater than 500 m is relatively small (Fig. 8 and Table 10). As such, while all results are presented for the sake of completeness, less emphasis is placed on their interpretation when H > 500 m. This is also demonstrated in Fig. 9b–d, where the differences become more scattered for the higher-elevation intervals. Figure 9a shows that all the differences are near-normally [Gaussian] distributed, hence justifying our use of descriptive statistics throughout this manuscript.

Fig. 9 Distributions of differences among benchmarks and the MERIT3″ DEM for different intervals. (a) All 3842 data points, (b) 3273 points below 500 m, (c) 395 points between 500 and 1500 m, and (d) 174 points above 1500 m. Note the different y-axis scales

With the data available to us, focussing on the < 500 m band in Table 10 shows that, despite the presence of large maximum and minimum differences, MERIT3″ is more reliable, while Cartodem is less preferred among all the compared DEMs/DSMs. The principal metrics used from Table 10 to make this inference are the MAE and RMSE. From the percentages in Table 11, no DEMs/DSMs have more than 90% of points falling in the In1 or In2 intervals, which are defined based on the claimed DEM/DSM accuracies/precisions (cf. Table 6). In the < 500 m range only, however, all the DEMs/DSMs (except ASTER) have more than 90% of the points in the In3 interval. ASTER provides the smallest percentage in the interval In1 and the highest in In4, indicating it to be the least preferred DSM with respect to the ground-truth data in India. Thus, for the 1″ × 1″ DSMs, SRTM1″ and Cartodem appear more reliable as compared to ASTER over India. The MERIT3″ DEM has the highest percentage of points in intervals In1, In2 and In3 and the lowest in In4, indicating it to be the most preferred among all the five models compared to the ground-truth benchmarks in India.

Conclusions

In this study, four freely available DSMs (SRTM1″, SRTM3″, ASTER1″ and Cartodem1″), along with the MERIT DEM developed by removing multiple error components from SRTM3 v2.1, are investigated based on a model-to-model comparison over the whole of India and a “validation” using ground-truth benchmark height data over some regions of India. Since India has varying topography (land heights range from −2 m to +8586 m), the heights were divided into three ranges, namely H ≤ 500 m, 500 m < H ≤ 1500 m and H > 1500 m. The percentages of points lying in the claimed accuracy/precision limits for different DEMs/DSMs were also analysed.

The model-to-model comparison among DSMs shows that SRTM1″, SRTM3″ and Cartodem are congruous with their claimed accuracy/precision, but only for heights less than 500 m. Cartodem has the least discrepancies with SRTM1″ compared to ASTER and SRTM3″ in all three height ranges tested. There are artefacts between Cartodem and other DSMs due to time-varying heights in lakes and open-pit mining sites. Visual representation of the DSM differences confirmed that stripe effects are present in SRTM, ASTER and Cartodem over India, which appear to have been eliminated/reduced following the procedures involved in the production of MERIT3″ (Yamazaki et al. 2017).

The validation with the only ground-truth data available to us shows that no DEMs/DSMs satisfy their claimed accuracies (intervals In1 and In2 in Table 6) in any height range. However, for elevations less than 500 m only, the DEMs/DSMs (except ASTER) satisfy interval In3, which is still beyond their claimed accuracies/precisions. The MERIT3″ DEM is observed to be more reliable compared to the other DEMs/DSMs based on overall, range-wise and accuracy-wise analyses. However, this needs to be qualified by our use of only canopy heights to convert the ground-truth data to DSM-compatible heights.