Introduction

A typology is a classification or ordering of units to assist understanding and interpreting differences and similarities among units (Dahl et al. 2007). It conveys the general characteristics of a type to all units in the type. In an applied context, a typology can be a tool for conservation by dividing a resource into classes that are expected to respond similarly to conservation action. Knowledge applicable to a unit in a type can be applied to other units of the same type, possibly streamlining monitoring programs and facilitating development of across-the-board conservation strategies. Thus, a typology may be used to develop conservation strategies even for units for which there is no or limited monitoring data. Furthermore, typologies provide the conceptual framework needed to view a resource at a scale larger than the single unit and provide insight into conservation objectives and actions that may emerge only when the units are organized at the larger scale.

Typologies are especially relevant when working with many units. Specifically, the alluvial valley of the Mississippi River (MAV) is an extensive geographical area harboring hundreds of lakes created by recurring fluvial dynamics. The lakes are scattered throughout the valley and carved over thousands of years by shifting river courses and other hydro-fluvial processes associated with contemporary and prehistoric rivers (Fisk 1944; Saucier 1994). Lakes in the MAV have significant ecological importance as their diversity supports a large component of North American biodiversity. The diverse landscapes and community types on the MAV provide extraordinary habitat for a range of fauna including about 40 species of mussels, 45 species of reptiles and amphibians, 50 species of mammals, and about 60% of all bird species in the contiguous United States (Brown et al. 2000). Moreover, nearly 100 fish species are documented in lakes and associated backwaters (Baker et al. 1991; Dembkowski and Miranda 2014). Because of their geological and ecological importance, coupled with their primeval scenery, the hundreds of lakes in the valley have been branded “a national treasure” (Miranda 2016). Despite their similar origins, these lakes display a diversity of morphologies and successional stages representative of the natural evolution from aquatic to forested wetlands (Huffman et al. 1982; Saucier 1994; Biedenharn et al. 2018). Additionally, over the last two centuries deforestation of about three quarters of the MAV and leveeing to reduce flooding in support of agriculture development (Gardiner and Oliver 2005) have affected lakes in diverse ways. A central challenge in conservation of this extensive and fragmented natural resource is developing conservation strategies for the entire valley based on sparse information.

One way to address this complexity is to create a lake typology. A typology that describes characteristics of the different types of lakes in the MAV can be of practical value by aiding in the planning and targeting of conservation, research, and management action in support of abiotic and biotic resources (Higgins et al. 2005). A typology that catalogues lake descriptors relevant to biotic interests may allow conservation focused on habitat or on species protection and restoration to efficiently identify targets, even when local monitoring data are absent, or when lakes are not accessible because of private ownership. Perhaps a central application of a typology may be to reshape conservation into a broader administrative framework. Overall abiotic and biotic diversity may be best administered by conservation and restoration of whole segments of the valley rather than single units. To this end, our objectives were to (1) catalog the current extent and distribution of permanent lakes in the MAV; (2) characterize lake morphology, hydrology, and landscape position; and (3) develop a typology that assembles lakes into types relevant to aquatic conservation, restoration, and management of biotic communities.

Methods

Study area

The MAV extends across 10 million ha (an area larger than Portugal) from the confluence of the Upper Mississippi and Ohio rivers to the Gulf of Mexico (Fig. 1). Representing the historical floodplain of the lower Mississippi River, the alluvial valley has a low gradient of 0.1 m/km (Saucier 1994) and was formerly characterized by extensive tracts of bottomland hardwood forests and one of the largest contiguous wetlands in North America (Wiken et al. 2011). By the late twentieth century over 74% of the natural vegetation of the MAV had been cleared in support of agriculture (Gardiner and Oliver 2005) and hydrologic modifications including levee construction, stream channelization, and ditching to remove surface water had produced long-term declines in the extent and condition of wetlands (Baker et al. 1991; Wiken et al. 2011). The Mississippi River south of Baton Rouge, Louisiana was excluded from this inventory because historically the river has had higher natural levees and lower rates of channel migration that produce substantial differences in the genesis and character of fluvial systems (Saucier 1994).

Fig. 1

The alluvial valley of the Mississippi River (shaded in gray) between latitudes 37.3° N and 30.4° N, about 770 km north to south. At its widest the valley stretches almost 200 km east to west. The left panel shows the distribution of the major rivers that flow through the valley; the right panel shows the distribution of 1329 lakes. The lakes shown range in area from 3 to 5602 ha and consistently retain some wetted area. Other lakes exist but were excluded from analyses because they dry up seasonally or irregularly and represent ecosystems transitional between lakes and wetlands.

Data sources

Large-scale GIS data on ecologically significant variables are available from various U.S. federal agencies. Boundaries of the MAV were delineated based on existing U.S. Environmental Protection Agency (USEPA) ecoregion classifications. The USEPA Level III Ecoregion delineates the MAV with margins that conform to standards accepted by federal and state agencies. Aquatic features from which natural lakes and other small water bodies can be extracted are mapped in the National Hydrography Dataset (NHD) available from the U.S. Geological Survey (USGS). The NHD also identifies various connections including artificial canals and subsurface pipes and provides spatial information for mapped objects. Estimates for relative inundation frequency and system connectivity have been mapped by the Gulf Coastal Plains & Ozarks Landscape Conservation Cooperative (Allen 2016). Data on land-cover distribution are available from the USGS National Landcover Database (2011), and data on agriculture and agricultural features are available from the U.S. Department of Agriculture (USDA) cropland database.

Lake identification

Floodplain lakes have diverse origins (Saucier 1994) and exhibit a continuum of successional stages that complicate the process of defining a lake. Sloughs and seasonal backswamps present special challenges, with delineations between lakes and wetlands at risk of becoming arbitrary. For the purposes of this study, lakes were limited to water bodies having permanent surface water, regardless of extent of seasonal flooding. Moreover, permanent water bodies had to be linked directly or indirectly to a stream channel, with minimal evidence of permanent through-system flow, and not identified as an artificial impoundment, aquaculture pond, or borrow pit.

To identify lakes based on these criteria we selected NHD lake/pond polygons permanently covered with water which were classified as “wet” based on Allen (2016). Aquatic features that intersected cultivated crops (USGS National Landcover Database 2011) do not represent natural water bodies and were eliminated. Low stage pools adjacent to river channels were eliminated by clipping all waterbodies within a 1-km buffer of NHD flowlines. This clipping carries the risk of eliminating real lakes adjacent to rivers and requires a manual review of features along main courses, which was also necessary to identify lakes falsely classified as a river course due to the existence of narrow connections at both ends or to seasonal flood-based connectivity. This set of polygon features was then further subset based on the minimum area of observed recurrent surface water as measured with the inundation index (Allen 2016). The inundation index reflects the percentage of time in which a pixel was observed to be wet based on Landsat images captured from December to April 1983–2011 (Allen 2016) and may be thought of as long-term flood pulsing. An index score of 100 means the pixel was classified as wet in all images, while a value of zero implies the pixel was not classified as wet in any images. Polygon features were only included in our analysis if they had at least 2 ha of wetted area (i.e., 22 pixels) with an inundation index value of 100. Thus, our analysis was limited to only those lakes which consistently held water. The NHD polygon features which met these criteria were then manually validated to ensure they were not artificial reservoirs, moist soil units, or other artificially managed water bodies not of interest to this study. The above sequence of steps provided an objective methodology for determining permanent open water compared with existing NHD attributes of “perennial” and “intermittent.”
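The wetted-area criterion described above can be sketched in a few lines. This is an illustrative sketch only, not the GIS workflow used in the study: the polygon names and index values are hypothetical, and the 30 m pixel size (0.09 ha per pixel) is an assumption based on standard Landsat resolution, consistent with 22 pixels being roughly 2 ha.

```python
# Assumption: each Landsat pixel is 30 m x 30 m (0.09 ha), so 22 pixels is about 2 ha.
PIXEL_AREA_HA = 30 * 30 / 10_000
MIN_PERMANENT_PIXELS = 22  # threshold stated in the text


def count_permanent_pixels(index_values):
    """Count pixels classified as wet in every image (inundation index of 100)."""
    return sum(1 for value in index_values if value == 100)


def is_permanent_lake(index_values, min_pixels=MIN_PERMANENT_PIXELS):
    """Apply the wetted-area criterion to one polygon's inundation index values."""
    return count_permanent_pixels(index_values) >= min_pixels


# Hypothetical polygons: each list holds the inundation index of the pixels inside it.
polygons = {
    "polygon_a": [100] * 30 + [80] * 50,   # 30 permanently wet pixels
    "polygon_b": [100] * 10 + [95] * 200,  # only 10 permanently wet pixels
}

kept = [name for name, values in polygons.items() if is_permanent_lake(values)]
permanent_area_ha = {
    name: count_permanent_pixels(values) * PIXEL_AREA_HA
    for name, values in polygons.items()
}
print(kept, permanent_area_ha)
```

In this toy case only the first polygon passes, even though the second covers a larger total area, because the criterion counts only pixels that were wet in every image.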

Lake descriptors

Variables descriptive of lake geomorphology and hydrology were selected based on their relevance to biotic communities in aquatic systems (Table 1). Previous research on floodplain lakes in the MAV has suggested a hierarchy in lake descriptors based on physical, chemical, and biological characteristics. Characteristics such as lake area, elongation, depth, surrounding landuse, connectivity to rivers, and inundation index are first-order determinants of fish community assemblages (Dembkowski and Miranda 2012; Andrews et al. 2014; Miranda et al. 2014). These first-order physical determinants influence second-order chemical determinants such as water quality and nutrients. Both first and second order determinants dictate third-order lake biotic characteristics (Dembkowski and Miranda 2012). We used existing geospatial data to extract lake descriptors that focused on first-order determinants (Table 1).

Table 1 Lake descriptors used to represent 1329 permanent floodplain lakes in the Mississippi Alluvial Valley

Maximum depth is a key first-order determinant of fish community composition (Miranda 2011; Dembkowski and Miranda 2012) but is rarely available on a landscape scale. A remote sensing surrogate for depth was developed for this study using multiple observations of Landsat imagery and a dataset of lakes with known maximum depth. Google Earth Engine (Gorelick et al. 2017) was used to select and process cloud-free Landsat imagery (Surface Reflectance, Tier 1 Collection) acquired during July–September from 1985 to 2016. The entire MAV study area is covered by nine Landsat scenes, and an individual Landsat scene covers approximately 185 × 185 km. For each scene, 20–43 images satisfied temporal and environmental requirements, for a total of 297 images over the entire MAV. Open water extent was delineated using methodology described in Allen (2016). Relative patterns of turbidity in open water across all scenes were calculated as the per-pixel mean value of the red band (630–690 nm) over all images. Reflectance in the 630–690 nm range is commonly included in models to discriminate patterns of turbidity (Kloiber et al. 2002; Dogliotti et al. 2015) and has been found to accurately characterize turbidity in lakes within the study area including Moon Lake (Ritchie et al. 1987) and Lake Chicot (Harrington et al. 1992). Median summer turbidity values were assigned to each lake polygon based on mean turbidity for all pixels in the lake polygon area. According to change-point analyses reported by Miranda et al. (2017), ecological conditions in lakes of the MAV exhibit a tipping point at 1.2–2 m around which the lake system transitions from one state to another. A logistic regression model was developed with the known-depth lakes to predict the probability of a lake having maximum depth > 2 m (Pdepth > 2 m) from the remotely sensed median summer turbidity. The model was then applied to all lakes to predict Pdepth > 2 m.
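To make the depth surrogate concrete, the sketch below fits a one-predictor logistic regression of a deep/shallow indicator on log turbidity and then predicts Pdepth > 2 m. It is a minimal illustration, not the fitted model from this study: the turbidity values and depth flags are synthetic, the function names are hypothetical, and Newton-Raphson is used only as a convenient way to obtain maximum likelihood estimates.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit P(y = 1) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p           # gradient with respect to the intercept
            g1 += (yi - p) * xi    # gradient with respect to the slope
            h00 += w               # entries of the (negated) Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic calibration lakes (NOT data from the study): median summer
# red-band turbidity and whether the known maximum depth exceeds 2 m.
turbidity = [0.02, 0.03, 0.04, 0.05, 0.06, 0.08, 0.10, 0.12, 0.15, 0.20]
deeper_than_2m = [1, 1, 1, 0, 1, 1, 0, 1, 0, 0]

log_turbidity = [math.log(t) for t in turbidity]
intercept, slope = fit_logistic(log_turbidity, deeper_than_2m)


def p_depth_gt_2m(turbidity_value):
    """Predicted probability that a lake is deeper than 2 m."""
    logit = intercept + slope * math.log(turbidity_value)
    return 1.0 / (1.0 + math.exp(-logit))


print(round(slope, 2), round(p_depth_gt_2m(0.03), 2), round(p_depth_gt_2m(0.18), 2))
```

With these synthetic values the fitted slope is negative, so clearer lakes receive a higher predicted probability of exceeding 2 m, which mirrors the direction of the relationship the model is meant to capture.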

We also used the U.S. Census Bureau TIGER/Line Files (U.S. Census Bureau 2017), without any distinction of road class, to estimate access to these lakes. Access was defined as the presence/absence of a road within 200 m of the lake. We applied stepwise logistic regression analysis to investigate whether the variables defined in Table 2 were associated with presence/absence of a road.
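The access definition can be illustrated with a simple planar distance check. This sketch is an approximation under stated assumptions, not the GIS procedure used here: coordinates are assumed to be in a projected system with units of metres, the lake is represented only by its shoreline vertices, roads are polylines, and all geometries are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def has_road_access(shoreline, roads, max_distance_m=200.0):
    """True if any shoreline vertex lies within max_distance_m of any road segment."""
    for road in roads:
        for a, b in zip(road, road[1:]):
            if any(point_segment_distance(p, a, b) <= max_distance_m for p in shoreline):
                return True
    return False


# Hypothetical lake (a 100 m square) and two hypothetical north-south roads.
shoreline = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
near_road = [[(250.0, -500.0), (250.0, 500.0)]]  # 150 m east of the lake
far_road = [[(400.0, -500.0), (400.0, 500.0)]]   # 300 m east of the lake

print(has_road_access(shoreline, near_road), has_road_access(shoreline, far_road))
```

Checking only shoreline vertices can miss a road that passes close to a long, straight shoreline segment, so a production workflow would more likely buffer the lake polygon by 200 m and test for intersection with the road layer.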

Table 2 Characteristics of 1,329 lakes in the Mississippi Alluvial Valley

Lake similarities and clustering

Similarities among lakes were investigated with nonmetric multidimensional scaling (nMDS) and with nonhierarchical (i.e., partitional) k-means clustering applied to a select group of six lake descriptors. The nonparametric nMDS ordination visually illustrated the similarity distribution of lakes relative to the six descriptors in three-dimensional space and was implemented with the PRIMER version 7 package (Clarke et al. 2014). The nonparametric k-means clustering procedure assembled lakes into groups by seeking to minimize within-group sums of squares about a k number of group centroids. Even though k-means has been in existence since the 1950s and many other clustering procedures have been developed since then, it is still one of the most popular algorithms for clustering, likely because of its efficiency and empirical success (Jain 2010). The procedure was implemented with k-R clustering in the PRIMER version 7 package (Clarke et al. 2014), a routine that systematically searches for the optimal number of clusters by applying the SIMPROF permutation test (Clarke and Gorley 2015). This test compares the similarity profile (Euclidean distance) of the real data matrix against a set of 999 simulated similarity profiles generated by random permutation of the values of variables included in the clustering. Under the null hypothesis of no difference, an R-test was expected to find the real profile equal to the 999 artificial profiles in more than 5% of the comparisons.
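The core idea of k-means, minimizing within-group sums of squares about k centroids, can be sketched with Lloyd's algorithm. This is a generic illustration with a fixed k and hypothetical two-descriptor data; it does not reproduce the k-R clustering routine or the SIMPROF permutation test in PRIMER, which also search for the number of clusters.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the previous centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def within_group_ss(points, labels, centroids):
    """Total within-group sum of squares, the quantity k-means seeks to minimize."""
    return sum(squared_distance(p, centroids[label]) for p, label in zip(points, labels))


# Hypothetical standardized descriptors for six lakes (two descriptors each).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centroids = kmeans(points, initial_centroids=[points[0], points[3]])
print(labels, round(within_group_ss(points, labels, centroids), 3))
```

Because Lloyd's algorithm converges to a local optimum that depends on the starting centroids, practical implementations usually run it from several starting configurations and keep the solution with the smallest within-group sum of squares.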

The six lake descriptors were length, width, Pdepth > 2 m, inundation index, disconnection, and surrounding wetland/forest cover (Table 1). The remaining descriptors assembled (e.g., area, length/width ratio, agriculture cover) were excluded from the cluster analysis because they were correlated with the descriptors included (|rs| > 0.4). In cluster analysis, collinearity among variables tends to overweight the information those variables share, so excluding the correlated descriptors mostly avoided this problem. Moreover, location within the valley (e.g., latitude, longitude, in/out of batture) was excluded to avoid lake clusters that differ mostly in geography and may not necessarily require different conservation planning. Nevertheless, variables excluded from cluster analysis were used after the clusters were created to further characterize the resulting clusters.
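The correlation screen can be sketched as follows, assuming rs denotes Spearman's rank correlation. The descriptor values are hypothetical and the helper names are illustrative; the sketch only shows how a candidate descriptor would be flagged when its absolute rank correlation with an included descriptor exceeds 0.4.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical descriptor values for six lakes (not data from the study).
included = {
    "length_km": [1.0, 2.5, 0.8, 4.0, 3.1, 1.7],
    "inundation_index": [90, 40, 75, 60, 95, 50],
}
candidates = {"area_ha": [5, 30, 4, 120, 60, 12]}

THRESHOLD = 0.4  # cutoff stated in the text
flagged = {
    name: [inc for inc, inc_values in included.items()
           if abs(spearman(values, inc_values)) > THRESHOLD]
    for name, values in candidates.items()
}
print(flagged)
```

In this toy example area is flagged because it increases monotonically with length, while its rank correlation with the inundation index is close to zero.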

Results

We identified 1,329 lakes that continually retain at least 2 ha of flooded area (Fig. 1). These lakes totaled over 100,000 ha in surface area and over 10,000 km in shoreline length and had a broad diversity of characteristics (Table 2). Approximately 70% of the lakes by number and 60% of the lakes by area occurred west of the Mississippi River. Lakes were distributed over seven states including Arkansas (599 lakes; 32,889 total hectares), Louisiana (344; 45,392), Mississippi (340; 29,447), Tennessee (73; 8,820), Missouri (25; 4,030), Kentucky (14; 291), and Illinois (8; 978). Note that the state lake counts and areas add up to more than the MAV totals because 74 lakes are shared by states and were counted in more than one state. The number and area of lakes were split about evenly inside and outside the MAV system of levees.

A set of 89 lakes with known maximum depth was available to develop the predictive logistic regression model to estimate the probability of a lake having a maximum depth > 2 m. The set of lakes ranged in maximum depth from 0.5 to 19.8 m and averaged 4.1 m. In this set 74% of the lakes were deeper than 2 m. According to the logistic model, the log of the odds of a lake being deeper than 2 m was negatively related to the log of median summer turbidity. In other words, the clearer the lake, the more likely it is to be deeper than 2 m. The Wald χ2 test (χ2 = 19.5; P < 0.01) indicated that the logistic model provided a better fit to the data than the intercept-only model (i.e., null model). The concordance statistic C was 0.77 (C can range 0.5–1 with 0.7 indicating a good model).
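For readers unfamiliar with the concordance statistic, the sketch below shows how C can be computed from predicted probabilities and observed outcomes. The probabilities and outcomes are hypothetical and the function name is illustrative; C is taken here as the proportion of deep/shallow lake pairs in which the deep lake received the higher predicted probability, with ties counted as one half.

```python
def concordance(probabilities, outcomes):
    """Concordance statistic C over all pairs with different observed outcomes."""
    positives = [p for p, y in zip(probabilities, outcomes) if y == 1]
    negatives = [p for p, y in zip(probabilities, outcomes) if y == 0]
    score = 0.0
    for p_pos in positives:
        for p_neg in negatives:
            if p_pos > p_neg:
                score += 1.0   # concordant pair
            elif p_pos == p_neg:
                score += 0.5   # tied pair counts as half
    return score / (len(positives) * len(negatives))


# Hypothetical predicted Pdepth > 2 m and observed depth class (1 = deeper than 2 m).
probabilities = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3]
outcomes = [1, 1, 1, 1, 0, 0]

c_statistic = concordance(probabilities, outcomes)
print(c_statistic)  # 6 of the 8 deep/shallow pairs are concordant
```

A value of 0.5 corresponds to a model that ranks lakes no better than chance, and a value of 1 to a model that ranks every deep lake above every shallow lake.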

The nMDS ordination suggested that the 1,329 lakes exhibited a continuum of characteristics in multidimensional space (Fig. 2) and were isolated into an artificial optimal arrangement of 12 clusters (i.e., types) by the permutation test. We note that Fig. 2 shows substantial overlap among clusters, but this overlap is reduced as additional axes are included in the ordination.

Fig. 2

Nonmetric multidimensional scaling ordination of length, width, Pdepth > 2 m, inundation index, disconnection, and surrounding wetland/forest cover (defined in Table 1) over 1,329 lakes in the alluvial valley of the Mississippi River. Each circle represents a lake color coded by type, and distances between circles represent relative similarity. The thicker frames of the 3-dimensional ordination next to each panel isolate the view shown by each panel. The 3-dimensional stress is 0.16. The online version of this article shows color.

Cluster designations for the 1,329 lakes, along with key lake characteristics, are available online (https://gcpolcc.databasin.org/maps/0d5517c361df4a178e48d6f1101f2e3b/active). Area was excluded from clustering because it was correlated with lake length and width, which were included in the analysis, but area is perhaps the lake attribute most immediately evident and was therefore deemed a key characteristic for a classification system. Thus, for practicality the 12 types were separated into four conglomerates based on median lake area: extra-small, small, medium, and large (Fig. 3). In the extra-small conglomerate, the median area of three types did not exceed 10 ha. In the small conglomerate, the median area of four types varied from 15 to 25 ha. In the medium conglomerate, the median area of three types varied from 35 to 45 ha. In the large conglomerate, the median area of two types exceeded 225 ha. The three extra-small types included 30% of the lakes and 5% of the total area; the four small types included 37% of the lakes and 15% of the total area; the three medium types included 25% of the lakes and 20% of the total area; the two large types included 8% of the lakes and 60% of the total area.

Fig. 3

Characteristics (defined in Table 1) of 12 lake types defined in the alluvial valley of the Mississippi River. The filled circles represent the 50th percentile and the whiskers the 5th and 95th percentiles. The extra-small (xS), small (S), medium (M), and large (L) type conglomerates are identified by background shading.

Extra-small lakes

Lakes in the extra-small conglomerate had similar length and width dimensions and differed mostly in depth, disconnection, surrounding land cover, and distribution in or out of the batture (Fig. 3). Lakes of types 1 and 3 occurred mostly within the batture, although those of type 3 were surrounded by substantially more wetland/forest cover than those of type 1 (lakes of type 3 had the highest percentage of wetland/forest of all 12 types). Lakes of type 2 occurred principally outside the batture, were surrounded largely by agriculture, and were shallower and more disconnected than lakes of types 1 and 3.

Small lakes

Lakes of types 4 and 7 occurred mostly outside the batture, included some of the most disconnected lakes in the valley, and were surrounded mostly by agriculture and very few wetlands (Fig. 3). Lakes of type 4 differed from those of type 7 in that the former were wider and shorter, and therefore had lower length/width ratios compared to the more slender and elongated lakes of type 7. In fact, lakes of type 7 included some of the narrowest and most elongated lakes in the MAV. Lakes of types 5 and 6 occurred mostly within the batture and were more connected than lakes of types 4 and 7. Lakes of type 6 differed from those of type 5 in that they were surrounded by substantially more wetland/forest cover and less agriculture, and tended to be more disconnected and deeper.

Medium lakes

Relative to lakes of types 9 and 10, the medium lakes of type 8 occurred mostly outside the batture, were shallower, had a lower inundation frequency index, and were more disconnected and surrounded by agriculture (Fig. 3). Lakes of types 9 and 10 differed in that those of type 9 were longer and narrower. In contrast, lakes of type 10 fluctuated less in area, as they had a higher inundation frequency index than those of type 9.

Large lakes

The two lake types in the large conglomerate included some of the longest, widest, and deepest lakes in the MAV, but not necessarily the most elongated, as suggested by average length/width ratios (Fig. 3). Lakes of type 12 were deeper and considerably more stable in depth and area than those in any other type, as suggested by the high median Pdepth > 2 m and high inundation index. Conversely, lakes of type 11 had some of the lowest inundation frequency indexes of all 12 types, suggesting these lakes have dramatic flood pulsing cycles, annual or longer term. These fluctuations could be due to cycles of flooding, desiccation, or both. More of the lakes of type 11 occurred within the batture than lakes of type 12. Moreover, lakes of type 11 tended to have more wetland/forest cover and less agriculture cover than those of type 12.

Geography of lake types

In general, the 12 types occupied broad expanses of the MAV and showed no compelling geographical distribution patterns that would associate them with restricted landscape coordinates. However, lakes of four types (i.e., 2, 4, 7, and 8) tended to have sprawling distributions. The remaining eight types occurred throughout the MAV but were concentrated near rivers. Lakes of types 11 and 12 occurred principally along the Mississippi River, with some along the Arkansas River.

Access

Road access was available to 74% of the lakes and varied from less than 50% for lakes of type 3 to 100% for lakes of type 12 (Fig. 4). Overall, access was least available for types in the extra-small conglomerate and most available for types in the large conglomerate. Moreover, within conglomerates, there was a consistent inverse relationship between access and extent of wetlands and forests surrounding lakes. The stepwise logistic regression verified that access increased with lake size (P < 0.01) and decreased with percentage wetlands and forest adjoining the lake (P < 0.01). The Wald χ2 test (χ2 = 164.7; P < 0.01) for this multiple-variable logistic model suggested that it provided a better fit to the data than the intercept-only model. The concordance statistic C was 0.77, indicating that in 77% of pairs of lakes with and without road access the model assigned the higher predicted probability of access to the lake that had access.

Fig. 4

Percentage of lakes with road access within 200 m, according to lake types. The extra-small (xS), small (S), medium (M), and large (L) type conglomerates are identified by background shading.

Discussion

Freshwater ecosystems provide irreplaceable services such as nutrient removal, flood regulation, water supply, and fishery production. The quality of freshwater ecosystems affects biogeochemical dynamics and ecological processes that determine biodiversity at local, regional, and global scales. Floodplain lakes and their associated riparian habitats in the MAV are amongst the most biologically diverse in North America (Brown et al. 2000) and have inestimable historical and cultural values (Smith 1988). In aggregate, and in association with contiguous wetlands, permanent floodplain lakes in the MAV represent a unique natural resource that provides numerous ecological, economic, and socio-cultural services (Jenkins et al. 2010). Anthropogenic alterations in the valley including leveeing and land clearing have dramatically impacted many of these lakes and stripped critical natural resources from current and future generations. Conservation of permanent floodplain lakes in the MAV has been limited due to factors such as the extent of the valley, accessibility to lakes, the absence of large-scale organization and cooperation, and an overall lack of knowledge about this natural resource.

Mitigation of further degradation in the MAV is dependent partly on a strategic approach to conservation assisted by large-scale inventories. Technology in the form of remote sensing has now made large-scale inventories possible. Our inventory is a first step towards developing a large-scale conservation program of floodplain lakes in the MAV, and the typology resulting from this inventory represents the first-ever classification system for permanent floodplain lakes. The inventory reveals that lakes are very diverse relative to first-order determinants (Dembkowski and Miranda 2012). The 12-type typology splits lakes according to their geomorphology, connectivity to major rivers, surrounding landuse, and recurrent inundation. Approximately 68% of the lakes were classified as small or extra-small, and 8% as large. The small percentage of large lakes is compensated for by their extent that, when combined, accounts for over 60% of the total lake area. Other lakes exist in the MAV but were excluded from our typology because they dry up seasonally or irregularly and represent mostly ecosystems transitional between lakes and wetlands. The lakes excluded are typically small, shallow, and poorly connected to stream channels, and many are ancient.

Our typology was generated through a statistical analysis that partitioned what is a continuum in multidimensional space into an artificial arrangement of lake types. This simplification was sought to facilitate two aims. First, the resulting typology could serve as a means of developing goals and objectives for lake monitoring and conservation efforts in the MAV. This aim is simplified if all-purpose conservation plans can be developed according to a few lake types. Second, the typology could serve to move the attention from an almost exclusive emphasis on lake-by-lake management towards a more holistic consideration of lakes as types, each with their own unique qualities, but mostly replicated throughout the MAV. Our lake typology represents the framework needed for thinking of lakes as pieces of a larger and interactive system. Considering lakes as types can generate conservation insight not evident when considering them as isolated units by forcing a focus on the fundamental differences among lake types. Conversely, if a conservation strategy is successful in achieving objectives in a single lake, the strategy may generalize to lakes of the same type.

Our classification system exposes the need for further research. Previous investigations into the water quality, fish, and fisheries of floodplain lakes in the MAV have indicated major differences in ecological characteristics among lakes. Studies have shown that lake geomorphological attributes such as area, depth, and connectivity tempered by landcover and riparian composition control lake water quality, fish assemblage diversity, and fish assemblage composition (Miranda 2011; Dembkowski and Miranda 2012, 2014; Miranda et al. 2014; Goetz et al. 2015). Also, fishery characteristics such as angler catch, expectations, and participation reportedly depend on lake size (Miranda et al. 2018). Therefore, we anticipate that biotic characteristics will differ among the 12 lake types. However, large-scale systematic analyses of biotic assemblages in floodplain lakes of the MAV are absent. Additional research is needed to define their ecological conditions. Moreover, the lake types may have diverse origins (e.g., abandoned channels, topographic depressions, crevasses, sloughs) that affect their geomorphology, sediment, and hydrologic pathways that can further define and explain the typology. Our typology can provide a framework essential to organizing the investigations needed to define origin, water dynamics, water quality, and ecological conditions of biotic communities such as riparian forests, mussels, fish, and avian communities to construct conservation plans applicable to different lake types.

Conclusion

The floodplain lake typology assembled for the MAV provides a large-scale appreciation of the properties of permanent floodplain lakes. It is a functional tool that can be used to identify conservation and research needs, adapt monitoring and management programs, customize environmental programs, and use conservation resources more effectively to achieve large-scale management objectives. Instead of a one-size-fits-all program aimed at addressing all complications associated with floodplain lakes in the MAV, resource managers can focus monitoring and conservation programs on problems and opportunities linked to each lake type. A conservation organization can develop management plans and budgets based on a clearer concept of the abiotic characteristics of specific lake types. Principles found to be applicable to a unit of a certain type may be applicable to all units of the same type, facilitating development of standardized monitoring and conservation programs. The typology can also help identify understudied lake systems in need of additional biotic characterization. The lake types can guide design of research and management programs that match the demands of the aquatic environment, potentially increasing conservation effectiveness. Thus, our floodplain lakes classification represents a decision-making tool that can contribute to administration of floodplain lakes conservation. While currently our methodology may not be directly transferrable globally due to lack of suitable geospatial data sets, the concept of creating typologies based on local needs and available data to facilitate management is transferrable as the floodplains of major rivers worldwide have similar lake distributions and conservation problems.