1 Introduction

Digital Soil Mapping (DSM) is the term given to producing predictive soil maps by the use of mathematical models applied to field observations of soils and synoptic layers related to soil formation and distribution (McBratney et al. 2003); this was termed “predictive soil mapping” by Scull et al. (2003), which is perhaps not such a suitable term, since all soil mapping is predictive of what the map user will encounter on the landscape. The “digital” in DSM is a byproduct of advances in technology and is meant to replace or extend the inductive reasoning of the expert soil mapper. The idea is to predict soil types, diagnostic features, or properties over a landscape, based on a set of observations and a set of environmental covariates covering the area to be mapped. These covariates are supposed to be either proxies for soil-forming factors, or sometimes simply empirically related to the property, feature or class to be mapped.

We restrict attention here to so-called ‘scorpan’-based DSM, that is, where covariates are chosen to represent climate (‘c’), organisms (‘o’), relief (‘r’), parent material (‘p’), and time (‘a’, replacing Jenny’s ‘t’); known soils (‘s’) are used for calibration and neighborhood (‘n’) relations (i.e., local spatial correlation) may be used. In practice, ‘r’ (terrain) and ‘o’ as represented by vegetation indices, surface reflectance or land use maps are the most widely-used covariates. The attraction of DSM is easy to understand: large areas can be covered with reduced field survey, the uncertainty shows the reliability of the map, and the models behind the predictions can be made explicit, often providing insight into soil geography. This is in contrast to previous approaches, which relied on the mapper’s mental model of the soil landscape, spatialized by manual interpretation of aerial photographs (Farshad et al. 2013). The geopedological approach of Zinck (2013) is the most theoretically-sound of these methods, because it is based on a systematic hierarchical soil-landscape analysis, not an ad hoc partitioning of the landscape based on perceived homogeneity.

Most digitally-produced soil maps are of single properties, notably soil organic C and particle-size distribution, often showing the depth distribution (e.g., Liu et al. 2013) as specified by the GlobalSoilMap.net project (Arrouays et al. 2014). Empirical methods based on point observations and correlation with spatially-complete covariates are well suited to such mapping, although they provide no insight into soil geomorphology. By contrast, the geopedological approach considers soil as a natural body with its own history, ecology, function and, importantly, spatial relation with other bodies. The actual soils form clusters in the very large potential space formed by each attribute taken separately, and the soil function can only be appreciated as a whole, much greater than the sum of its parts. Maps of these clusters, i.e., soil types, can then be interpreted for multiple uses, and in addition they form a sound basis for stratification in the mapping of single properties. Thus, here we restrict our attention to DSM efforts to map soil types, as in the geopedological approach.

Some DSM approaches start from existing maps, which implicitly contain rich geopedological knowledge, and use digital methods to refine or update them (e.g., Kempen et al. 2009; Yang et al. 2011), and even attempt to disaggregate existing maps to a finer scale, e.g., the DSMART approach (Odgers et al. 2014) used in the POLARIS gridded maps of the continental USA (Chaney et al. 2019). The SoLIM (Soil-Landscape Inference Model) approach (Zhu et al. 2001) reasons by analogy from known soil-landscape relations. This requires either a pre-existing map of soil types or expert knowledge of where each type occurs on the landscape. Here we only consider the case where there is no existing soil-landscape map, only some point observations (usually purposive or opportunistic, not a probability sample) and a set of whole-field covariates, i.e., the common ‘scorpan’ approach.

DSM is the obvious soil mapping counterpart to similar data-driven approaches to knowledge in this computer age. Most current DSM models rely on terrain, climate and vegetation intensity covariates. These are easy to obtain; see for example Hengl (2013). They can be used as proxies for soil forming factors, thus they are related via pedogenesis to many soil properties, and from the assemblage of these properties to a soil type. An early statement of the hope of the digital soil mapper is from Zhu et al. (1996): “We assume that every soil series occurs under one or more typical environmental configurations or ‘niches’ and has a typical set of soil properties... and can be characterized by a vector of environmental parameters in an m-dimensional parameter space”.

Why is the ‘scorpan’ approach not always successful? Fundamentally, there is much more to soil formation than the current environment. In particular, the soil forming factor ‘time’ is only approximately represented by landscape position, and the factor ‘parent material’ does not always have a close relation to topography. As early as 1935 Milne (1935) recognized that some east African toposequences (his ‘catenas’) developed on uniform parent rock, others on sequences of outcropping rocks. A direct correlation between soil types and slope positions was thus not possible. Variations in parent material (in the absence of a detailed surficial geology map) and the short time-scale of covariates compared with the time-scale of soil formation result in models that do not fully characterize the soil cover.

Another problem with the empirical ‘scorpan’ approach is that soils often have inherited much of their current characteristics from previous climates and the associated vegetation, and indeed they may be the result of multiple cycles of soil formation, as evidenced by stone lines, lithologic discontinuities or landscape inversion resulting from cycles of erosion and/or deposition. In younger landscapes, the topography may be relict from recent disruptions such as glaciation or vulcanism.

A final major problem with the ‘scorpan’ approach is that soil bodies often have a spatial relation, where materials have been transported from one body to form another. Examples are alluvial fans and colluvial deposits. These do have a landscape position and morphology (accounted for with the ‘r’ factor) but in addition inherit their parent material (‘p’ factor) from adjacent units.

To date DSM has been almost exclusively empirical: a statistical relation is established between the observations and covariates, and this relation is then applied across the area to be mapped. Soil property DSM has used, among others, geostatistics (e.g., Kriging with External Drift), multiple regression, random forest regression and similar “machine learning” methods, and generalized additive models (GAM). Soil feature or class mapping has used multiple logistic regression (e.g., Abbaszadeh Afshar et al. 2018), similarity in feature space (e.g., Zhu et al. 2015), or random forest classification. A good review of the various DSM methods for soil classes is by Heung et al. (2016).
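The empirical workflow just described can be illustrated with a deliberately minimal sketch. It does not reproduce any of the cited methods: a plain nearest-neighbour rule in scaled covariate space stands in for the statistical model, and all covariate values, soil class names and function names are invented for illustration.

```python
import math

# Hypothetical training observations:
# (elevation m, slope %, wetness index) -> soil class. All values are invented.
observations = [
    ((310.0, 2.0, 9.5), "alluvial soil"),
    ((305.0, 1.0, 10.0), "alluvial soil"),
    ((420.0, 25.0, 4.0), "upland soil"),
    ((450.0, 30.0, 3.5), "upland soil"),
]

def scale(observations):
    """Return per-covariate (min, range) so that covariates are comparable."""
    columns = list(zip(*(x for x, _ in observations)))
    return [(min(c), (max(c) - min(c)) or 1.0) for c in columns]

def predict(covariates, observations):
    """Assign the class of the nearest observation in scaled covariate space."""
    params = scale(observations)

    def norm(x):
        return [(v - lo) / rng for v, (lo, rng) in zip(x, params)]

    target = norm(covariates)
    nearest = min(observations, key=lambda ob: math.dist(target, norm(ob[0])))
    return nearest[1]

# "Apply the relation across the area": here, two unvisited grid cells.
grid_cells = {"cell_1": (308.0, 1.5, 9.8), "cell_2": (440.0, 28.0, 3.8)}
predicted_map = {name: predict(cov, observations) for name, cov in grid_cells.items()}
print(predicted_map)
```

Whatever model replaces the nearest-neighbour rule, the prediction at a cell depends only on the covariate values at that cell, which is the property examined in the examples that follow.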

In this chapter I give some examples where ‘scorpan’-based DSM of soil classes based on the usual covariates will fail, but where geomorphic analysis results in a successful landscape stratification within which field observations can be placed to produce a reliable map. We consider four examples: exhumed paleosols, depositional low-relief environments, inverted landscapes, and young post-glacial landscapes. The last example is explained in detail.

A separate issue is the complex and contingent nature of pedogenesis as evolution with continuously-varying environmental conditions (Phillips 2001; Huggett 1998); this suggests that there is a chaotic, non-deterministic element to pedogenesis that cannot be inferred from observations of soils in similar niches. This is outside the scope of this chapter.

2 Example 1: Exhumed Paleosols

Exhumed paleosols are soils, now at the surface or covered by a thin mantle of newer material, which developed under a different climate than the present. They were buried by new deposits, e.g., by a younger glacial till or loess, and then exposed again at the surface by landscape evolution (dissection, down-wasting). Their soil properties are largely controlled by conditions in the past, although of course now subject to current conditions for further evolution. A classic study is from Ruhe et al. (1967), who identified various glacial till, loess and paleosol layers from four glacial and three interglacial stages in Iowa (USA). A detailed geomorphic investigation reveals, for example, relict fluvial surfaces (floodplain alluvium, slope fan alluvium) from the Sangamon interglacial which are now above the current base level where current fans and flood plains are located; further, a relict pediment with stone line developed in Kansan till is mantled by a thin Wisconsin loess layer, and on the interfluves a modern soil developed in the loess but overlying a ‘gumbotil’ layer, i.e., very clayey weathered Kansan till (Kay and Pearce 1920). Some late Wisconsin-Recent slopes have cut back to interfluves, and on these erosional slopes Yarmouth-Sangamon paleosols outcrop, with younger soils above and below. These exhumed paleosols may also be truncated, so the paleo-B horizons are now at the surface. Others have thin caps of loess or slope wash.

How could ‘scorpan’-based DSM deal with this area? The exhumed soils occupy a defined elevation range, but since this represents a relict dissected surface, there are several soil classes in this same position. By contrast, the geomorphic analysis of Ruhe explains the soil distribution and provides a key for mapping. In geopedological terms, the surfaces would be separated at the lithology level.

3 Example 2: Depositional Low-Relief Environments

Soils in depositional low-relief environments such as fluvial systems with rapidly-changing channels and variable infilling (e.g., the Rhine-Meuse delta of the Netherlands, see Berendsen 2005) cannot be mapped by interpolation, even with intensive boring campaigns, without geomorphic interpretation of the paleo-geography. Another example is the detailed study of the alluvial and terrace soils associated with the Río Guarapiche in Monagas state, Venezuela (Zinck and Urriola 1971; Zinck 1987). From the geomorphology one can delineate various landscape components such as current and abandoned channels, backswamps, splay fans, and associate these with soil types. The relief is subtle. Vegetation differences can reveal some of the differences, but only in areas where there has been no artificial drainage.

How could ‘scorpan’-based DSM deal with this area? The landforms are quite similar, although backswamps may have slightly more concave shape. The elevation differences between terrace levels are quite small, and the absolute elevation decreases downstream, so that a single elevation range cannot be used to identify a terrace. Splay fans have the same elevation as backswamps but quite different soils.
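The downstream-gradient problem can be made concrete with a small worked example. The numbers are invented, and height above the local channel is used only as one conceivable derived covariate; it is not a method taken from the cited studies.

```python
# Hypothetical sites along a river. The channel is assumed to fall 1 m per km
# from 100 m elevation; terraces are assumed to sit 2 m above the channel.
CHANNEL_START_M = 100.0
GRADIENT_M_PER_KM = 1.0
TERRACE_HEIGHT_M = 2.0

def channel_elevation(distance_km):
    """Elevation of the channel at a given distance downstream."""
    return CHANNEL_START_M - GRADIENT_M_PER_KM * distance_km

# Each site: (distance downstream km, elevation m, landscape unit)
sites = []
for d in (0.0, 5.0, 10.0):
    sites.append((d, channel_elevation(d), "floodplain"))
    sites.append((d, channel_elevation(d) + TERRACE_HEIGHT_M, "terrace"))

def elevation_range(unit):
    """Range of absolute elevation for one unit."""
    z = [elev for _, elev, u in sites if u == unit]
    return min(z), max(z)

def relative_range(unit):
    """Range of height above the local channel for one unit."""
    h = [elev - channel_elevation(d) for d, elev, u in sites if u == unit]
    return min(h), max(h)

# Absolute elevation: the two units overlap almost completely.
print(elevation_range("floodplain"), elevation_range("terrace"))
# Height above the local channel: the two units separate.
print(relative_range("floodplain"), relative_range("terrace"))
```

Even such a derived covariate would not separate splay fans from backswamps, which share the same elevation but have quite different soils.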

4 Example 3: Inverted Landscapes

Pain and Ollier (1995) present a convincing argument that landscape inversion is a common form of landscape evolution. Ferricrete (‘laterite’, ‘plinthite’, ‘ironstone’) and duricrusts often form in lower landscape positions, and lava flows may preferentially follow pre-existing valleys. These materials are more resistant to erosion than their surroundings, and eventually end up as the highest landscape positions. An example is central Uganda (where Milne developed the catena concept), where thick ferricrete mesas are typically the highest landscape positions. During the inversion process continued weathering of saprolite and movement of materials and solutes along the slope have resulted in a complex soil landscape (Brown et al. 2004); see Fig. 9.1.

Fig. 9.1
A diagram of the Buruli catena shows granite gneiss bedrock, saprolite, soil, sand and clay, termite mound, stone line, ferricrete, relic riverbed, the convex hilltop, the pediment slope, dambo, and sandy margins. Ferricrete mesas and eroded ferricrete mesas are shown at the top.

Conceptual diagram of the Buruli catena, central Uganda. (Fig. 2 in Brown et al. (2004), used by permission)

How could ‘scorpan’-based DSM deal with this area? If the positions within the catena are regular, the ‘r’ covariates elevation, slope gradient, curvature and wetness index could separate the soil types. This depends on (1) a limited area with a repeating landscape pattern, (2) training observations in all landscape positions. If this area is mapped as part of a larger area, perhaps including coordinates in a random forest DSM model might be able to “box” this area, within which the relation with indicated covariates would apply. As in the paleosols example, geomorphic analysis clearly explains the soil distribution and provides a key for mapping. In geopedological terms, the components of the catena would be separated at the landform level.

5 Example 4: Young Post-Glacial Landscapes

Large areas of northern North America and Europe are covered with soils developed in young post-glacial landscapes; smaller areas are from recent alpine glaciation. In these areas the geomorphology and distribution of parent materials can only be understood by means of the detailed history of glaciation and deglaciation (e.g., proglacial lakes, outwash plains, sandurs) which have only an indirect relation with terrain variables. We illustrate this with an example from Tompkins and Tioga counties, New York State (USA).

Figure 9.2 is a fragment of the USGS 7.5′ 1:24000 topographic map West Danby and Willseyville (NY) sheets. An analyst following the geopedological approach would use stereo-pairs of remotely-sensed images, e.g., airphotos, but even without stereo view the map clearly shows features that are immediately recognizable to a trained analyst familiar with the Pleistocene history of the region (Bloom 2018): (1) a terminal moraine of the Valley Heads stage, behind which are (2) hummocks and kettles from stagnating ice; (3) pro-moraine outwash terraces, breached on the E and NE margin by (4) post-retreat outflow channels which formed (5) outwash terraces transecting the end moraine; (6) truncated spurs and post-glacial incisions; (7) in the NE edge a high-level terrace formed above the moraine when it was blocking outflow; (8) high-level outflows from the main glacial tongue, when it was pressed up against the E margin; (9) post glacial fans from upland erosion; (10) a large kettle, now a shallow lake and swamp, in front of the centre of the moraine, corresponding to a large block of ice separated from the glacier.

Fig. 9.2
A topographic map shows parts of the West Danby and Willseyville sheets. Letters A to H and numbers 1 to 10 are marked on the map.

Fragment of the USGS 7.5′ 1:24,000 topographic map West Danby and Willseyville (NY) sheets. Annotations are geomorphic features (black numbers) and sites where soils are discussed (red letters); see text

Figure 9.3 shows the detailed soil survey of the same area, provided by the NRCS (USA) Web Soil Survey (http://websoilsurvey.sc.egov.usda.gov), here displayed on a Google Earth background by the SoilWeb application (O’Geen et al. 2017; California Soil Resource Lab n.d). Table 9.1 shows a tentative geopedologic legend for this area.

Fig. 9.3
A map shows the detailed soil survey areas in shades of green, with letters A to H scattered across the map.

Detailed soil survey of the area shown in Fig. 9.2, provided by the NRCS (USA) Web Soil Survey (http://websoilsurvey.sc.egov.usda.gov), displayed on a Google Earth background by the SoilWeb application (http://www.gelib.com/soilweb.htm). Annotations as in Fig. 9.2. See SoilWeb for map unit codes and descriptions

Table 9.1 Provisional geopedologic legend for Example 4

Referring to Fig. 9.3, we identify several situations where a DSM approach using the usual covariates will not work, but where geomorphic knowledge results in an easy landscape interpretation and soil mapping:

  • Positions A and B have identical slopes (flat), differ in elevation by less than a meter, are the same distance from streams, have almost the same wetness index, and are both agricultural fields, yet the soils are quite different. A is mapped as the somewhat poorly-drained Middlebury (coarse-loamy Fluvaquentic Eutrochrepts) and well-drained Tioga series (coarse-loamy Dystric Fluventic Eutrochrepts), aggrading alluvial soils in silty and sandy alluvium from the present-day outlet of Michigan Creek, while B is mapped as the Howard series (loamy-skeletal Glossoboric Hapludalfs), a well-drained, well-developed (considering the approximately 12,000 years since the retreat of the glacier) gravelly loam from pro-glacial outwash, with about 30% rock fragments, mostly rounded cobbles of mixed origin. These soils differ considerably in age and lithology but cannot be separated by terrain covariates.

  • Positions C and D (two examples) have identical very steep slopes and slope shapes (straight), both well-vegetated with native hardwoods, yet the soils are radically different. C is again the Howard series but truncated by the modern outlet of Michigan Creek to expose an outcrop of gravelly glacial outwash, while D is mapped as the Lordstown series (coarse-loamy Typic Dystrochrepts), a moderately deep to bedrock channery silt loam with about 20% large to medium rock fragments from Devonian shale and mudstone; on the steepest slopes the soils are probably in the shallow to bedrock Arnot series (loamy-skeletal Lithic Dystrochrepts).

  • Positions E and F are adjacent, with similar terrain parameters, elevation and land use, but are easily recognized as a modern alluvial fan (9, E) and glacial outwash (10, F). Again, F is the Howard series; here E is mapped as Chenango (loamy-skeletal Typic Dystrudepts), a younger soil with periodic flash floods (e.g., due to hurricanes and rapid snowmelt in the contributing watershed) resulting in additions of subrounded poorly-sorted gravels (mudstone and sandstone) from the surrounding uplands.

  • Position G is especially interesting. It is at a high elevation, has moderately steep slopes, is in native forest vegetation, yet is also mapped as the Howard series, i.e., it is glacial outwash, not soil in residuum, e.g., the surrounding Lordstown soils with the same topography and vegetation. The geomorphic clue here is outside the figure: Michigan Hollow (seen entering on the NE) is a through valley where the original drainage divide, about 5 km N, was removed by the glacier; subsequently as that tongue melted a large amount of outwash was deposited in what was then a lake behind the terminal moraine (1). Apparently, there were two levels; the higher one (G) was subsequently easily eroded by upland runoff; the lower terrace (between G and C) remains almost flat. The incision at (8) is also explained by a period where the ice filled the valley (NW in the figure) so that meltwater had to follow this channel to produce some of the outwash (5). The W margin of this hill shows the same phenomenon but from when the ice had melted enough to allow water to flow along its margins at the base of the truncated spur.

  • Positions H and I differ by only 30 m elevation, are both flat, both with dense vegetation; yet while I is again mapped as Howard (glacial outwash), H is mapped as Typic and Terric Medisaprists, i.e., an organic soil. Geomorphically this is easy to understand: both positions are part of the kettle moraine (2). Some similar positions to I are mapped as Arkport (coarse-loamy Psammentic Hapludalfs); these are further behind the end moraine where meltwater was sandier.
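The common thread in these cases can be stated formally: a model that is a function of the covariates alone must return the same class wherever the covariates are the same. A minimal sketch follows, with invented covariate values standing in for positions A and B (simplified here to be exactly equal, whereas the text describes them as nearly so) and a hypothetical `geomorphic_unit` field added to show how a geomorphic stratification would resolve the ambiguity.

```python
from collections import defaultdict

# Invented covariate values for two positions described as indistinguishable
# by terrain and land use: (slope %, wetness index, land use).
sites = [
    {"id": "A", "covariates": (0.0, 9.0, "agriculture"),
     "geomorphic_unit": "modern alluvium", "soil": "Middlebury/Tioga"},
    {"id": "B", "covariates": (0.0, 9.0, "agriculture"),
     "geomorphic_unit": "pro-glacial outwash", "soil": "Howard"},
]

def ambiguous_keys(sites, key_fn):
    """Return the predictor keys under which more than one soil is observed."""
    soils_by_key = defaultdict(set)
    for site in sites:
        soils_by_key[key_fn(site)].add(site["soil"])
    return {k: v for k, v in soils_by_key.items() if len(v) > 1}

# Covariates only: one key maps to two soils, so no function of these
# covariates can predict both sites correctly.
covariates_only = ambiguous_keys(sites, lambda s: s["covariates"])
print(covariates_only)

# Covariates plus the (hypothetical) geomorphic stratum: no ambiguity remains.
with_stratum = ambiguous_keys(
    sites, lambda s: (s["geomorphic_unit"],) + s["covariates"])
print(with_stratum)
```

The choice of model (regression, random forest, or other) does not matter here; the limitation lies in the information content of the predictors.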

Although ‘scorpan’-based DSM would not be able to find these differences, some other approaches might have some success. To do so, they would have to emulate the geopedologic interpretation. For example, it might be possible to identify post-glacial alluvial fans by their relative landscape position: where narrow steep side valleys emerge onto outwash plains. Also, their shape is diagnostic: narrow at the proximal (upstream) end, widening at the distal end. These might be revealed by a segmentation which then considered adjacency and oriented (proximal-distal) shape relations. However, the boundary between the fan and the outwash which it overlies (E vs. F) is quite subtle. Although visible to the geopedologist, it seems difficult to delineate automatically. The difference between C and D might be revealed by total slope length and position on the slope.
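The fan-recognition rule suggested above (relative position plus proximal-distal shape) might be sketched as follows. The segments, their labels, the widths, the flow graph and the widening threshold are all hypothetical; a real implementation would have to derive them from a terrain segmentation and a flow-direction grid.

```python
# Hypothetical segments from a terrain segmentation. Widths are measured across
# the flow direction at the proximal (upstream) and distal (downstream) ends.
segments = {
    "s1": {"kind": "steep side valley", "proximal_width_m": 40, "distal_width_m": 45},
    "s2": {"kind": "unclassified", "proximal_width_m": 50, "distal_width_m": 400},
    "s3": {"kind": "outwash plain", "proximal_width_m": 900, "distal_width_m": 950},
    "s4": {"kind": "unclassified", "proximal_width_m": 300, "distal_width_m": 310},
}
# Directed adjacency: upstream segment -> segments it drains onto.
flows_to = {"s1": ["s2"], "s2": ["s3"], "s4": ["s3"]}

WIDENING_RATIO = 3.0  # assumed threshold, not taken from the text

def upstream_of(segment_id):
    """Segments that drain directly onto the given segment."""
    return [up for up, downs in flows_to.items() if segment_id in downs]

def is_candidate_fan(segment_id):
    """Apply the three conditions: fed by a side valley, emerging onto
    outwash, and widening from the proximal to the distal end."""
    seg = segments[segment_id]
    fed_by_side_valley = any(
        segments[up]["kind"] == "steep side valley"
        for up in upstream_of(segment_id))
    emerges_onto_outwash = any(
        segments[down]["kind"] == "outwash plain"
        for down in flows_to.get(segment_id, []))
    widens = seg["distal_width_m"] >= WIDENING_RATIO * seg["proximal_width_m"]
    return fed_by_side_valley and emerges_onto_outwash and widens

candidates = [s for s in segments if is_candidate_fan(s)]
print(candidates)
```

The sketch presupposes that the segmentation has already drawn the subtle fan-outwash boundary, which is exactly the step that seems hard to automate.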

Another covariate that might have some success is hyperspectral remote sensing, which might allow vegetation communities to be distinguished (e.g., between positions H and I). Differences in amount of weathering (factor ‘a’) can sometimes be inferred from aerial gamma-ray survey (Moonjun et al. 2017). However, the geopedologic legend and map give a holistic view of the soil landscape.

6 Discussion

6.1 What Could Be the Contribution of the Geopedologic Approach to DSM?

DSM methods are important additions to the soil mapper’s toolkit, especially when large areas need to be mapped, and when estimates of uncertainty are needed. In simple landscapes with close correlation between topographic parameters, land use and soil type, they have shown good success. This is especially true in smaller areas where many soil-forming factors are more or less constant, and only a few covariates are needed to separate the soil types or properties. An example is a landscape with a single lithology (‘p’) and climate (‘c’). Here toposequences (‘r’) correlate well with soil development (‘a’), and the land use (‘o’) can account for anthropogenic influences. But as Milne’s original catena shows, in many toposequences lithology is not constant, for example the ironstone caps of relict plateaus from a landscape inversion. These are easily identifiable and understandable for the geomorphologist.

However, as the above examples show, there are situations where a geomorphic understanding is necessary to identify locations where each soil type is expected. The only geopedological knowledge used in current DSM approaches is the selection of covariates to (presumably) represent soil-forming factors. A recent set of challenges for the future of pedometrics (Wadoux et al. 2021) recognizes this limitation as one of its ten challenges: “Can we incorporate mechanistic pedological knowledge in digital soil mapping?” Ma et al. (2019) discuss this in detail. This is not exactly an appeal to geopedology, but “mechanistic” could be replaced by “expert”, i.e., geopedologic knowledge of soil-landscape relations, built explicitly into the DSM workflow.

One promising approach is so-called “contextual” DSM (Behrens et al. 2018). This uses a multiscale version of a digital elevation model (DEM) to derive a set of terrain derivatives (factor ‘r’) at different scales. The relation to geopedology is that the coarser scales may in some cases correspond to the higher levels in the geopedologic hierarchy, from landscape (most general) through relief type to landform. A similar idea is the use of so-called “deep learning” in the form of convolutional neural networks (CNN) which use contextual information from the environmental covariates, i.e., from a hierarchical set of neighborhoods, not just the covariate values at the observation point (Padarian et al. 2019). These have not yet been applied to soil classes but do have the possibility to account for some adjacency or upstream-downstream relations.
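The multiscale idea can be sketched on a one-dimensional elevation profile. This is not the procedure of Behrens et al. (2018): it simply smooths an invented profile with moving-average windows of increasing width and computes the slope at each scale, to show how coarse scales retain the broad landscape trend while fine scales pick up local relief. The function names, profile and window sizes are assumptions made for illustration.

```python
def smooth(profile, half_window):
    """Moving average; the window is truncated at the ends of the profile."""
    out = []
    for i in range(len(profile)):
        lo = max(0, i - half_window)
        hi = min(len(profile), i + half_window + 1)
        out.append(sum(profile[lo:hi]) / (hi - lo))
    return out

def slope(profile, spacing_m):
    """Central-difference slope (m/m); one-sided differences at the ends."""
    n = len(profile)
    out = []
    for i in range(n):
        lo, hi = max(0, i - 1), min(n - 1, i + 1)
        out.append((profile[hi] - profile[lo]) / ((hi - lo) * spacing_m))
    return out

# Invented profile: a steady rise of 1 m per cell plus local relief that
# repeats every four cells.
LOCAL_RELIEF_M = (0.0, 2.0, 0.0, -2.0)
profile = [float(i) + LOCAL_RELIEF_M[i % 4] for i in range(12)]
SPACING_M = 10.0

# Slope at three scales; half_window 0 means no smoothing.
multiscale = {
    half_window: slope(smooth(profile, half_window), SPACING_M)
    for half_window in (0, 1, 3)
}
for half_window, values in multiscale.items():
    print(half_window, [round(v, 2) for v in values])
```

Each cell thus receives one slope value per scale, and the stack of values can be offered to the model as separate covariates. At the coarsest scale the interior slopes stay close to the overall trend of 0.1 m/m, whereas the unsmoothed slopes swing between negative and strongly positive values.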

Object-oriented image segmentation applied to stacks of terrain parameters (Dragut et al. 2009) offers a digital approach to discovering landscape units, which can perhaps be interpreted and correlated to soil types. However, several segments may have similar landscape parameters, yet be of contrasting origin, for example, alluvial terraces vs. glacial outwash terraces. Here the relative position in the landscape can be used to differentiate them. This requires concepts of adjacency and flow direction.

This raises the question as to whether geomorphology, as opposed to geomorphometry, can be digitally mapped (Bishop et al. 2012). If so, the digital geomorphic map could be used as a powerful covariate for digital soil mapping; perhaps geopedology would not be necessary. The most promising method is object-oriented analysis, followed by geomorphometric characterization (Hengl and Reuter 2008), leading, it is hoped, to interpretable terrain units. However, Bishop et al. are clear on the limitations: “Although this scale-dependent approach is conceptually pleasing, it is nonetheless fundamentally a cartographic approach to mapping that does not formally address issues of processes, internal and external forcing factors, feedback mechanisms and systems, or spatio-temporal dynamics.” In other words, geomorphology, and hence geopedology, is not simply terrain analysis, no matter how sophisticated. Evans (2012) has a similarly pessimistic view of the prospects for automated geomorphic mapping.

7 Conclusion

There are situations where neither DSM nor geopedology will be successful, and where intensive systematic field observation is the only way to map important soil differences. An example is given by Toomanian (2013) of a playa in the Zayandeh-rud valley, Iran, where a uniform surface is created by an aeolian mantle; this mantle covers a wide diversity of aeolian, lagoonal, and alluvial layers deposited during the Quaternary and Tertiary. The geomorphometry is uniform, as are the soil surface reflectance and vegetative cover. Although surface salinization can be detected, this is not related to important subsurface differences. There is no solution but to grid sample and interpolate. But for many soil landscapes, the integration of geomorphic understanding and its relation to soil genesis allows successful mapping, where simple environmental correlation using ‘scorpan’ covariates as presumed proxies for soil-forming factors is not successful.