Keywords

1 Introduction

Drainage information has been playing an important role in modeling geomorphological, hydrological, and ecological processes because of its direct reflection of water convergence and heat redistribution over land surface (Moore et al. 1991; Passalacqua et al. 2010). Digital terrain models (DEMs) are basic data for extracting drainage information. To meet the requirements from research and practice, topographic features have been always taken as a fundamental feature layer on maps and in geospatial databases. Technologies in remote sensing and photogrammetry have been developed for producing high resolution and widely available DEMs, for example the SRTM elevation data (Farr et al. 2007), which further poses new opportunities and challenges in digital terrain modeling. Being a central component of drainage information, drainage networks have been an increasing research focus for both the importance in various spatial models in earth science and the abundant byproducts during extraction process (Mark 1984; Jenson and Domingue 1988; Band 1986; Tarolli et al. 2012; Wilson 2012). In cartography and geographical information science, drainage networks, which are one component of topographic surface networks (Rana 2004), have been also of great interest to facilitate geographic information generalization, data compression and highly efficient terrain analysis algorithms (Wu 1981; Weibel 1992). High density drainage networks are essential prerequisites in many other fields of earth science (Jordan 2007).

The data source for extraction of drainage networks can be in the form of contour, triangulated irregular networks (TINs) or gridded DEMs. The algorithms based on contour lines involve spatial relationship detection among contour lines, curvature computation of contour lines, setting threshold value of characteristic points, linking characteristic points into drainage segments and building of drainage networks. Triangulation or map algebra can be used to facilitate the detection of characteristic points or segments (Aumann et al. 1991; Kweon and Kanade 1994; Ai 2007). However this type of methods suffers from determination of multiple threshold values and ambiguous selection of contiguous points for connection of segments. In addition, contour lines are generated from gridded DEMs in most cases nowadays, which resamples terrain surface and introduces information loss.

Theoretically, a TIN would be the best data source for extraction of drainage networks because vertices and edges in a TIN are defined as local terrain characteristics. However, an elevation TIN is normally either generated from contour lines or interpolated based on regular elevation grid and most methods for generating a TIN cannot ensure that the edges can meet defined geometric requirements. In order to derive drainage networks, Falcidieno and Spagnuolo (1991) designed Characteristic Region Configuration Graph based on geometric properties between neighboring triangles and defined boundaries lines between two distinctive regions as edges of surface networks. However Kreveld (1997) and Brandli (1996) argued that this algorithm simplifies the complexity of terrain surface and does not work for complex regions. Yu et al. (1996) proposed an algorithm based on a flow routing method that traces steepest downstream lines to generate flow accumulation. Then drainage networks can be filtered out using given threshold values. Zhou et al. (2011) proposed a TIN-based algorithm for drainage information extraction, in which gridded-DEM cells are partitioned into a triangular facet network. Then the flow path of each triangular facet is defined by its aspect and routing to an outlet or a local pit according to downstream neighboring facets. To compute the catchment area for a specific cell, all the upstream cells are summed up by counting flow paths passing this cell. However uncertainties would be inevitably introduced in the transformation from a DEM to a facet TIN. In addition, computations of a DEM to a TIN, of vector flow paths and of catchment area are consuming large computational resources.

Due to a wide availability of gridded elevation data at various resolutions, most research and practical efforts to derive drainage networks are dependent on gridded DEMs. For the same reason, there have been extensive investigations on algorithms of automatic extraction of drainage lines based on gridded DEMs by researchers from the disciplines like GIS, computer science, hydrology and geomorphology. Wood (1996) proposed a classification based on five criteria. Generally there are two types, those based on morphometric analysis methods by template detection or curvature filtering over surface fitting, and those based on simulation of water runoff and flow routing over terrain surfaces.

The morphometric analysis algorithms based on filtering by fixed or variable size kernels are very efficient and easy to be implemented in a parallel environment (Band 1986; Douglas 1986; Bennett and Armstrong 1996). The geometric concavity analyzing can be used to identify characteristic points over terrain surface. It can be implemented in two ways, comparing elevation values on profiles in a given window, or computing curvatures on a fitted polynomial surface. Then successive steps include linking characteristic points into lines, image thinning and further building of drainage networks. These sub-tasks are both difficult to implement and involve multiple threshold values settings. Various methods (Wood 1996; Schneider 2003) have been designed to manipulate them and multi-criterion analysis methods are used, which would introduce uncertainties into results. Thinning algorithms from image processing are usually used to generate connected drainage lines of a single pixel width, in which physical characteristics of terrain surface obviously are not considered. Another disadvantage of these algorithms is that further processing has to be done to extract watershed subdivisions and other drainage information and to correlate them with drainage networks.

The second type of algorithms has a physically sound theoretical assumption which imitates water runoff flowing and accumulating process over terrain surface (O’Callaghan and Mark 1984; Jenson and Domingue 1988; Freeman 1991; Martz et al. 1992; Planchon and Darboux 2001; Endreny and Wood 2003). The basic assumption is that water runoff drains out of land into the sea along downslope flow paths. Areas where the accumulated amount of flow paths is greater than a given threshold value are taken as drainage channels. In this assumption, every point on land surface should have one or multiple downstream flow directions. However, the points in depressions or pits which are surrounded by higher points and in flat areas cannot find proper flow directions. Those points would block upstream water runoff and affect correct generation of accumulation. There have been various ways to handle depression areas, for example image smoothing to eliminate small depressions (O’Callaghan and Mark 1984), elevating all points to the lowest pour height (Jenson and Domingue 1988; Wang and Liu 2006), leveling higher points along flow paths down to pit’s height (Martz and Garbrecht 1998). All of the strategies have influence on the forms of extracted drainage networks and a hybrid approach can produce better results comparing to river network from topographic maps (Poggio and Soille 2012). The second critical task is to assign flow directions to all cells in gridded DEMs. Special treatments should be made to points in flat areas which are produced by depression filling or inherent in original DEMs. Flow directions of these points can be assigned by introducing the direction information from rim of depression during the depression filling step (Jenson and Domingue 1988), or by altering elevation values randomly (Planchon and Darboux 2001) or linearly (Pan et al. 2012) to produce a gentle slope. To distribute water runoff from one point to its downslope neighbors, single-flow direction (SFD) and multiple-flow direction (MFD) algorithms have been proposed and evaluated in various scenarios (Zhou et al. 2011). MFD has a sound basis to simulate the divergent flow over ground. However recent investigations demonstrate that MFD algorithms produce broad drainage segments and patterns which are not well defined to show flow convergence and single flow direction algorithms show thier advantages (Zhou et al. 2011; Poggio and Soille 2012).

Considering the two types of methods with gridded DEMs, there are at least two main issues which have not been addressed carefully. Both morphometric analysis algorithms and flow routing algorithms produce broad areal shaped drainage channels, which cannot clearly represent channel positions and further affect parameter calibration in distributed hydrological modeling. Algorithms based on image thinning and operators of mathematical morphology can be used to generate single-pixel-width drainage segments. But apparently physical terrain characteristics are not considered. The second problem is that the methodology using threshold values can always miss drainage segments partly or even in total, which may in many circumstances fail to identify typical drainage segments, especially in upstream areas. Smaller threshold values produce high density drainage networks. But numerous pseudo and parallel drainage segments normally appear at downstream area (Tarboton 1997). It leads to difficulty of finding correct positions of valley heads and drainage density which are important to investigate geomorphological activities (Tribe 1991).

This chapter describes a new method by employing water runoff flow routing algorithm to derive the global structure of drainage networks, and the morphometric algorithm to supplement missing drainage cells at upstream area. A noise cleaning procedure is designed to remove problematic segments based on hydrological and geomorphological properties and to improve topological consistency of drainage networks. Based on test of two datasets of different terrain types, a quantitative analysis is made to the extracted drainage networks. Conclusions and discussion are provided at the end.

2 Methodology and Algorithms

Algorithms based on water runoff flow routing have a sound physical basis and can generate topologically explicit drainage networks and watershed subdivision by threshold values of accumulation. Morphometric algorithms can produce good results in the areas where drainage segments are well defined. These two types of algorithms are complementary if sensitivity to threshold values can be reduced and a better method can be designed different from image thinning.

2.1 Primary Drainage Networks and Watershed Information

The algorithm of extracting drainage lines based on water runoff routing executes typically five steps. First, all depressions in gridded elevation matrix are to be identified and processed. There are two ways to remove the depressions. The most used one is depression filling, which raises all grid cells in a depression to a new height value equal to the elevation of lowest pour points on the rim of the depression. Another way is dam breaching (Poggio and Soille 2012). In this chapter, the former one is used to ensure that all runoff flow paths can be directed to the boundary of depression regions.

After removal of depressions, every cell is given a flow direction based on deterministic-8 method (D8). The flow direction of a cell points to one neighbor cell which has the steepest downward slope among its eight neighbors. The flat area cells, which are produced by depressions removal or inherent in original data, must be specially treated. The method which was developed by Martz and Garbrecht ( 1998) is used here to assign flow directions to cells located in flat areas. A cell is a neighboring upstream cell to the one that its flow direction points to. A cell is a neighboring downstream cell to the one that is pointed by its flow direction. Those cells without upstream (or downstream) cells can be starts (or ends) of flow paths.

Each cell is assigned one unit of water as runoff on this point. Accumulation process is started from all starts of flow paths and arrives at boundary of data or research region along flow paths. Accumulation value of one cell is the number of its upstream cells and is taken as its drainage area. The results include a flow direction matrix and a drainage area matrix, as showed in Fig. 1b. The process corresponds to the physical process of water runoff over terrain surface.

Fig. 1
figure 1

Flow directions of grid cells with elevation value; a Flow directions over original elevation matrix. Each value denotes one cell’s elevation. b Accumulation matrix with flow directions. Each value denotes amount of upstream runoff. c Drainage segments filtered out by threshold value 40. Values greater than 0 denote Strahler orders. 1 and 2 denote first and second order in light pink and blue. d Watershed subdivisions. Values greater than 0 denote watershed numbers. The two closed contour lines are located around the two depressions.

A threshold value is given in the fourth step and the drainage area matrix is segmented into a binary matrix. The grid cells with higher drainage area are marked as drainage cells and constitute drainage segments connected along flow directions. However, as stated above, an appropriate threshold would best be determined according to terrain characteristics and expert experiences. A higher threshold value commonly produces lower density of resulted drainage lines.

The last step of this stage builds the drainage networks, acquires basic information of each drainage segment. The basic information of each drainage segment consists of unique identifier, length, slope, and topological information of connected segments. The topological connectivity combined with flowing directions can be used to assign stream ordering numbers to drainage segments to distinguish their hierarchical levels. This chapter uses Strahler ordering system for its popularity in hydrological applications. One drainage segment with no other segments draining in is marked as first order (as light pink segmetns in Fig. 1c). Two or more segments with equal order can join together and form a segment with one-level higher order. A segement uses the higher order number if two segments with different order numbers join together. Then all cells along one flow path are assigned a sub-watershed number according to the drainage segment that the flow path drains into.

Each drainage segment and its associated sub-watershed share one unique identifier which is started at 1. 0 denotes cells out of research region, drains out of data boundary without enough accumulation or simply no data. The sub-watershed identifiers and Strahler orders are stored in matrices and can be converted to vector structures when needed. The implementation is showed in Fig. 2. In the figure, StrhSta is an integer array and stores the number of drainage segments on each Strahler order (SID). For each grid cell P, the values of SP, OP and IP denote corresponding Strahler order, unique identifier and flowing-in number. OID denotes the object identifer number of each drainage segment and the corresponding watershed. The topological information of drainage segments is recorded during the iteration. The length and slope can be computed based on original elevation, Strahler order and identifier matrices.

Fig. 2
figure 2

Extraction of drainage information

2.2 Complementary Drainage Lines

The threshold value in above processing dramatically influences drainage density. Figure 3 shows the changing patterns of identified drainage segments with two threshold values. The value is normally set to as small as possible in order to identify high density drainage segments. However it is difficult to set an appropriate one.

Fig. 3
figure 3

Changes of drainage patterns with different threshold values. Yellow cells are third Strahler order. a Threshold value is 20. b Threshold value is 10

From visual inspection of the map containing contour lines and identified drainage segments, we can see that most missing drainage segments are actually located at upslope areas, where water runoff accumulation may not be significant but the geometric shapes of valleys are obvious. At this kind of situation, morphometric analysis has its advantage.

The second stage of our algorithm, which includes concavity analysis in bottom-up and top-down strategy, complements the initial results into a complete version. All the local terrain characteristic cells are identified using a moving 3 by 3 kernel (Douglas 1986). Valley cells are marked as “VL”. The channel head identified in primary network detection step is used in a bottom-up process. Then all the start cells of first order Strahler drainage segments are used to connect neighboring valley cells which drain into start cells. Figure 4a shows the growing result of Fig. 3a. The top-down step is executed based on the local terrain characteristic cells with “VL” which have not been identified as drainage cells but marked as valley cells. These cells flow along the steepest downward slope until reaching current drainage cells which are not necessarily first order drainage segments. In order to acquire reasonable results, the two steps consider segments’ slope during complementing process. The characteristic cells on flat area cannot be connected into drainage lines. The drainage lines are then organized into topological networks and labeled with Strahler order.

Fig. 4
figure 4

Complementary drainage segments based on geometric characteristic analysis. VL in the cells denote grid cells of valley; a bottom-up growing; b top-down connection

3 Rule-Based Treatment of Problematic Segments

Drainage networks extracted above are organized into a topological structure during Strahler ordering procedure. However, when a threshold value is set to small and after the complementary step in Sect. 2, a large amount of problematic drainage segments may be generated in various geometric forms. Parallel segments over flat area are often produced. Congested segments are those that partly or all of cells are direct neighbors of cells on other drainage segments and they do not flow into each other. Here only cardinal neighborhood between cells is considered (4-connectivity). This situation appears everywhere, as showed in Figs. 3 and 4. In general it makes areal shaped drainage segments and fragmented watershed sub-divisions. Image thinning methods used in previous research are apparently not appropriate because all cells are treated indiscriminately disregarding flow directions. In order to remove problematic cells, drainage segments are first classified into different categories based on types of neighboring cells, flow directions, Strahler ordering numbers and characteristic significance (valley, saddle or ridge point). Special treatment methods are designed to handle different situations.

3.1 Straight Drainage Segments

This kind of drainage segments mostly appear as first Strahler order located in lower elevation areas. The shape of a typical valley is nearly straight on 2D and the slope is small on terrain profile. The drainage accumulation value along its flow path does not change remarkably because there are few other segments draining in. The cells on this kind of segments are normally not “VL” cells marked in morphometric detection. The corresponding cells are simply removed from the drainage segments (as dark green in Fig. 5). The Strahler orders of downstream drainage segments are updated after this process to maintain the topological structure of drainage networks.

Fig. 5
figure 5

Removal of dark green straight drainage segments. Cells in light gray are remaining drainage segments. Values in each cell are drainage area value. VL in the small window denotes valley cells

3.2 Congestion of First Order Drainage Segments

After the complementary cells marked according to morphometric characteristics are appended, a large number of drainage segments are congested. In order to maintain topological connectivity of drainage networks, the cleaning process is initiated from the segments with first Strahler order. Three types of congestion are further categorized for corresponding treatments.

(1) One-cell length segments. If the only cell of a 1st-order segment has more than one cardinal neighboring cell with higher order, or more than two cardinal neighboring cells with 1st order which are not at downstream direction of it, it is removed directly (the dark gray cells in Fig. 6b).

Fig. 6
figure 6

Removal of congested drainage segments at first Strahler order; a Drainage cells before processing (1). b Drainage cells after processing (1). c Drainage cells before processing (2). d Drainage cells after processing (2). e Drainage cells before processing (3). f Drainage cells after processing (3)

(2) Congestion between 1st-order segments. If there exists two adjacent segments of this type, the one with the higher accumulation value, longer length, larger slope and smaller flow direction change is to be preserved. These criteria are applied consecutively until one is met. If both segments show identical configuration considering all criteria in very rare conditions, the first one is to be removed. If the removed segment is partly left, the cell located furthest downstream need to find a new neighboring cell as its pour point to ensure topological consistency of drainage networks. The new downstream cell is selected according to lowest elevation and highest accumulation value. The accumulation values of all cells on the two affected flow paths are updated afterward. If more than two 1st-order segments are congested together, the middle one is to be reserved to keep the final result balanced geometrically. The congestion involved more than 3 segments is very rare.

As showed in Fig. 6c, d. The dark green cells are removed due to congestion between 1st-order segments. The one which is partly removed has to change the flow direction of its last downstream cell (in dark gray) to appropriate neighboring cells.

(3) Congestion of 1st-order segments with higher-order segments. A 1st-order drainage segment can be congested with higher order segments in part or in whole. The congested cells on 1st-order drainage segments are to be removed and higher order neighbor cells leave unchanged. If the middle part of one 1st-order order segment is congested with other segments, the congested part is removed and this segment is broken into two segments.

The flow direction of last downstream cell of remaining segment is changed and the accumulation values along old and new flow paths are to be updated in the same way as above.

3.3 Congestion of Higher Order Drainage Segments

This section deals with congestion occuring between segments with Strahler order equal to or larger than 2. When there are congested cells between two segments with different Strahler orders, the lower one is to be removed. If the Strahler orders are equal, the one with smaller drainage area, short length is to be processed. The processing starts from the downstream outlet of the target segment. The congested cells are removed until the last congested cell. Local flow directions and drainage area on flow path are altered at the same time with similar strategy in above section. Figure 7 shows a 2nd-order segment (in blue) is partly modified to avoid congestion with a 5th-order segment (in pink).

Fig. 7
figure 7

Lower order segment is partly removed to reduce congestion with neighboring higher order segment

3.4 Processing Iteration

The above situations account for most congestion. However, the processing must be iterated for multiple passes because the removal of higher Strahler order congestion may produce a new situation. As in Fig. 7, after congestion removal, there are two 1st-order segments generated, which may generate further congestion.

4 Evaluation and Results

The new algorithm is tested with two DEM datasets from SRTM version 2.1 (SRTM 2012), whose original resolution is 3-arc seconds. The first one is the data tile named N36E110 with 1201 rows by 1201 columns. It covers an area in China with of loess landform shaped by erosion processes of wind and water runoff, where the Yellow River runs through. The elevation ranges from 414 to 1803 m. The second data is located at upstream area of Ciliwung River, West Java of Indonesia with 1000 rows by 1200 columns. It covers several eroded volcanic mountains. The elevation ranges from 5 to 3006 m. Figure 8 shows the terrain shading maps of the two testing datasets.

Fig. 8
figure 8

Two test DEM datasets. The sub-watershed with the red boundary in test dataset 1 is to be highlighted; a Test dataset 1; b Test dataset 2

In order to perform an evaluation of the results of the new algorithm, the flow routing algorithm and the new algorithm are implemented and applied with different accumulation threshold values (T) on the two testing datasets. The extracted drainage segments are organized into networks labeled with Strahler orders using the procedure showed in Fig. 2. The amount of drainage segments at different Strahler orders is listed in Tables 1 and 2. The rows in gray background are results from the new algorithm.

Table 1 Result of drainage networks extracted from dataset 1. T denotes the threshold value for accumulation filtering
Table 2 Result of drainage networks extracted from dataset 1. T denotes the threshold value for accumulation filtering

The statistics of extracted drainage networks in Tables 1 and 2 shows that threshold values have heavy influence on results of flow routing algorithm. The total number and the length of drainage segments, the Strahler ordering levels and the numbers of segments on each level are changed remarkably when threshold values are different. But the results from our algorithm are very consistent and stable concerning all indicators. The fluctuation at lower Strahler orders is due to the fact, that lower threshold values identify non-valley cells at low and flat valley bottom.

A sub-watershed in the first dataset (red boundary in Fig. 8a) is extracted out for a more careful comparison. Figure 9 shows results of the flow routing algorithm and the new algorithm. Figure 9d shows drainage segments are identified at every contour bends. The numbers of drainage segments at all Strahler orders are listed in Table 3. The corresponding bifurcation ratios are computed. It is apparent that the accumulation threshold values have large influence to this indicator in the flow routing algorithm. The ratio based on the new algorithm is stable to the changes of the threshold values.

Fig. 9
figure 9

The extracted drainage networks from ordinary algorithm and the new one. Red, Pixels in blue, yellow, green and pink are in 1st, 2nd, 3rd and 4th orders. Solid black curves are contour lines. a Drainage network at T=50. b Drainage network at T=100. c Drainage network of new algorithm. d Overlay of drainage with contour lines

Table 3 Result of drainage networks of a sub-watershed

5 Conclusion and Discussion

This chapter reviews the existing algorithms of drainage network extraction and finds that there are no eligible algorithms, which can produce complete and reasonable drainage networks from gridded DEMs. Based on analysis of inherent rationale of available research, we proposed and implemented a new method that integrates flow routing algorithm and revised local morphometric algorithm. A rule-based cleaning process is designed to evaluate and eliminate problematic and noise segments. Two example datasets of different terrain types were used to test the proposed method. The quantitative analysis of extracted drainage networks shows that results by this new algorithm are insensitive to threshold values and stable in hierarchical structures.

Further possible development based our work can be on different aspects. Multiple-flow-directions algorithms are more appropriate to simulate water diverging models in microtopography, especially as high resolution DEMs are more and more available from wide applications of LiDAR and low altitude aerial stereo images. Also in this scenario, as natural sinks or depressions are widespread, adaptable pit removal algorithm should be implemented by integration of filling and carving based on identifying objects that break flow routing over terrain surface. In low and flat downstream areas, there are many efforts to design reasonable treatment algorithms to generate flow routing. An alternative and more appropriate way is to use hydrological features and water bodies from large scale topographic maps and geospatial databases as an a priori knowledge, which are then incorporated to reduce problematic segments during or subsequent to drainage networks extraction process. At the same time, a geospatial database can be enriched by incorporating the extracted drainage information (flow direction, flow accumulation and watershed subdivision). Another work is to design an appropriate generalization method to simplify the very dense drainage networks of the new algorithm for various applications.