1 Introduction

Droppo et al. (2002) used the “urban continuum” concept to describe the movement and transformation dynamics of sediment within an urban environment, specifically focusing on surface washoff, transport through the drainage system, and deposition within a receiving water body. The endpoints of this conceptual framework were either conveyance to a wastewater treatment plant or discharge to a receiving water body via a combined sewer overflow (CSO). Selection of these endpoints reflected the location of the study area (Hamilton, Ontario, Canada), which is serviced by a combined sewer system, as well as the prominent focus on CSO abatement programs in the USA and Canada at that time (US EPA 1995, 2001; Irvine et al. 1998; Zukovs and Marsalek 2004; Irvine et al. 2005a, b, c).

Although stormwater ponds were being included in new suburban residential designs from the 1970s to manage stormwater runoff quantity, it was not until the 1990s that designs evolved to focus on water quality improvement as well (Mayer et al. 1996). By the mid-2000s, the green infrastructure concepts of Low Impact Development (LID, North America), Water Sensitive Urban Design (WSUD, Australia), Sustainable Urban Drainage Systems (SUDS, Europe), and later, Sponge Cities (China) had gained traction as a naturalized alternative to hard engineering approaches for managing stormwater runoff and CSOs (Kok 2004; Fletcher et al. 2015; Ahmmad 2017; Lashford et al. 2019; Qiao et al. 2020). These green infrastructure technologies include bioswales, raingardens, green roofs, pervious pavement, rainwater harvesting, cleansing biotopes, floating wetlands, and constructed wetlands. More recently, the concept of nature-based solutions (NbS) has been included in the design and visioning approach to urban management (Hanson et al. 2020; Moosavi et al. 2021). Ruangpan et al. (2020) define NbS as “…participatory, holistic, integrated approaches, using nature to enhance adaptive capacity, reduce hydro-meteorological risk, increase resilience, improve water quality, increase the opportunities for recreation, improve human well-being and health, enhance vegetation growth, and connect habitat and biodiversity.” Accordingly, NbS takes a broader approach to urban environmental management, although LID, WSUD, SUDS, and Sponge City techniques clearly are related and might be considered components of NbS. The shift towards green infrastructure over the past 20 years provided the dual benefit of water quantity and water quality management, but this newer NbS definition offers even greater scope to consider multiple ecosystem service benefits, including urban heat island mitigation, carbon sequestration, food provisioning, biodiversity/enhanced habitat, and aesthetics. The general progression from stormwater ponds to WSUD to NbS and stylized topologies of each are summarized in Fig. 1.

Fig. 1

Progression of stormwater design visions: (a) simple stormwater pond, primarily for runoff quantity detention, Buffalo, NY, USA (photo by authors), 1970s–early 2000s; (b) bioretention cell (raingarden, Singapore) as an example of WSUD, 2010s–present, with substantial depth for surface storage and four substrate layers for additional storage and water quality treatment. Focus has been on individual features, as shown here, or distributed, small features; (c) NbS design (peri-urban Bangkok, Thailand), showing the Master Plan, including an integrated, holistic vision for community wellbeing and optimization of ecosystem services through integration of a stormwater wetland, storage and polishing ponds, orchard storage and irrigation canals, community greywater treatment and storage ponds, and connecting drainage canals. Also shown are the community green space, greywater treatment and storage ponds, and adjacent single-family homes in both the Enlargement and Perspective visions (design by Prathana Reiyndara, LN316, Landscape Architectural Design 4 studio class, Thammasat University)

Given these newer green infrastructure trends in urban water management, the urban continuum concept of Droppo et al. (2002) might be revised to have a third endpoint, that of NbS features. Frantzeskaki (2019) concluded that successful NbS must be aesthetically appealing to citizens, creating a new green urban commons through social innovation and collaborative governance. This type of design approach is underscored in Fig. 1c. While NbS design is appealing in its connection to the new green urban commons concept, it is essential to fully integrate design and performance. Steinitz (2020), through his Framework for Theory and concept of geodesign, called for greater multidisciplinary interaction in this type of landscape and urban visioning: “…we must learn to understand that almost everything we do to change the landscape by design requires collaboration, whether with architects at the smaller scale, urban designers at the middle scale, geographers at the larger scale, and with engineers at all scales, and lawyers and bankers and government officials, yet with no one losing his/her professional identity.” Irvine et al. (2021a, b) examined how modeling approaches (including ecosystem services assessments) could facilitate bridging between the design and engineering communities to optimize pluvial flood management, in essence underlining Steinitz’s (2020) call for greater multidisciplinary interaction.

While keeping the design thinking of landscape architecture for NbS firmly in place, here we focus more on the technical issues of NbS design. The overarching objective of this paper, then, is to discuss current NbS research and design practices and identify gaps in our knowledge that may create barriers to effective implementation of NbS for management of particle-bound metals. This objective is addressed using a case study of two constructed wetlands in Geelong, Australia, in combination with an extended literature review and discussion. In keeping with the theme of this special volume, we also highlight the many contributions that Dr Ian Droppo has made to better our understanding of urban sediment dynamics.

2 Methods

2.1 Study sites

We conducted our study at two constructed wetlands in Geelong, Australia. Geelong is the second largest city (2021 population, 264,900) in the State of Victoria and is located approximately 75 km southwest of Melbourne (Fig. 2). While its proximity to Melbourne has promoted suburban development for commuters, Geelong has its own character and vision, being designated Australia’s first UNESCO City of Design (2017), in recognition of creativity and innovation towards building a sustainable, resilient, and inclusive community (Caspani 2019). Geelong experiences a marine temperate (Cfb) climate under the Köppen climate classification system, with a mean annual precipitation of 522 mm and mean maximum and minimum temperatures of 24.7 °C and 7.5 °C.

Fig. 2

Location map for study sites at Lara (Grand Lakes) and Armstrong Creek (Warralily)

Consistent with its UNESCO City of Design designation, Irvine et al. (2020) noted that Greater Geelong maintains more than 200 small streetscape scale and 150 large end-of-pipe WSUD assets, but our study focused on two large constructed wetland systems, one in the northern suburbs of Lara, at the Grand Lakes Estate development, and the other to the south at Warralily (Figs. 3 and 4) in Armstrong Creek. Both wetland areas can be considered examples of NbS, as they function to manage runoff quantity and quality, but also provide community wellbeing and habitat opportunities. The Grand Lakes NbS design is 16.75 ha, including a wetland water surface area of 5.63 ha. Opened in 2013, Grand Lakes includes boardwalks and open space for community recreation, a coffee shop and restaurant with a lookout over the wetland, and a diverse habitat. The wetland consists of eight cells. Sedimentation basins are present at the inlet zone of each cell to capture sediment from the catchment stormwater runoff. Inlet zones connect to macrophyte zones consisting of a combination of vegetated (marsh) and open water areas. High-flow bypass channels in the form of rock swales link wetland cells across the entire Grand Lakes wetland. The 600 ha catchment is 18% low-density, single-family residential and 80% agricultural, with the remainder a small mix of commercial uses, light industry, and the NbS wetland/park. The Grand Lakes Estate won the urban development industry’s national award for environmental excellence in 2016, largely based on the innovative NbS design.

Fig. 3

Grand Lakes constructed wetland and NbS area. Green lines represent major subsurface drainage pipes, red dots represent catch-basin locations, and yellow lines represent surface drainage channels

Fig. 4

Warralily constructed wetland and NbS area

The Warralily constructed wetland is considerably smaller, with a water surface area of 0.73 ha, and is also newer, having been completed in early 2017. The wetland has three cells, which are separated by dense macrophyte cover. A high-flow bypass channel exists on the northwestern side of the wetland, while a recreational pathway lines the southeastern side. Greater detail on the Warralily wetland characteristics is found in Dharmasena et al. (2021), but briefly, the wetland serves a low-density, single-family residential area of about 85.2 ha, and runoff is conveyed into the wetland via a single 1200 mm diameter inlet. The catchment is approximately 60% impervious.

2.2 Field methods

Samples for both constructed wetlands were collected between 12 and 18 December 2017 using a PVC tube (internal diameter of 21 mm) that was inserted into the sediment to a depth of 6–8 cm. The suction associated with the small diameter kept the sediment sample within the tube until retrieval from the water and subsequent extrusion on shore. The tubes were cleaned with water between each sample site. Boats were not available, so sampling was done to a wading depth or from boardwalk areas extending over the water (Fig. 5). Samples were collected from 17 sites in the Warralily wetland, representing all three cells, as well as the inlet and outlet of the wetland. Samples were collected from 38 sites in the Grand Lakes wetland, representing all eight cells. The upper 1–6 cm of samples extruded from the PVC tubes were collected in ziplock plastic bags for subsequent processing in the laboratory at Deakin University. Where the depth of water allowed, at selected sites, a plastic trowel was used to scoop bed sediment for textural analysis, which was stored in plastic ziplock bags until analysis in the laboratory.

Fig. 5

Sampling in Warralily constructed wetland (left) by wading and in Grand Lakes constructed wetland (right) from a boardwalk traversing the open water

Macroinvertebrate samples were collected at five sites near each of the inlet and the outlet of the Warralily constructed wetland (10 sites in total). A multihabitat sampling approach was taken at the inlet and outlet to ensure representative microhabitats were included for analysis (Roy et al. 2003). The microhabitats consisted of submerged macrophytes, floating macrophytes, areas with high density algae, littoral areas that were free of vegetation, and stagnant waters. At each site, a 2-min sweep sample was collected using a 500-μm sieve. The sieve was deliberately swept through macrophyte beds and marginal vegetation. In open water and submerged macrophytes, the sieve was moved in a zigzag vertical motion. In emergent macrophytes, a trowel was scraped along the submerged vegetation surface, upwards from the bases of the plants, to extract the invertebrates. Water quality profiling from surface to bed (temperature, dissolved oxygen, pH, conductivity, chlorophyll a) was done at each sample site using a YSI 6920 datasonde.

2.3 Laboratory methods

For metal analysis, sediment samples (approximately 7 g) were air dried, ground with a pestle and mortar, placed in disposable Teflon cups, and sealed with a polypropylene XRF ultralene film, following the methods used by Andreas et al. (2019). Samples were analyzed using a Bruker S1 Titan 600 X-Ray Fluorescence (XRF) system set to the soils mode library with a run time of 2 min/sample. For QA/QC purposes certified reference material (CRM) (NIST SRM2710X, Montana I soils) was analysed at the start and end of each analytical batch and duplicate samples were run on 24% of the samples. While XRF analysis of environmental samples may be less common than traditional techniques, such as inductively coupled plasma (ICP) spectrometry or atomic absorption spectroscopy (AAS), many studies of different media (soils, sediments, skin-whitening creams) have shown that XRF can produce comparable results, certainly at a screening-level, with the added advantage of faster workflow because pre-chemistry is not required (e.g., Radu and Diamond 2009; Murphy et al. 2012; Tepanosyan et al. 2022).

Sediment samples for textural analysis were oven-dried at 105 °C and organic material visible with the naked eye was manually removed. The samples then were ground with a pestle and mortar. Approximately 100 g of each sample was dry-sieved, with the smallest sieve size being 0.075 mm. If more than 10% of the dry-sieved sample collected in the bottom pan, this fine fraction was further analysed by hydrometer.

For the macroinvertebrate analysis, the samples were thoroughly rinsed through a 500 μm mesh sieve to remove fine sediment. A wash bottle was used to rinse the invertebrates and smaller debris from the larger particles. Large organic material (leaves, twigs, algae, etc.) was visually identified and discarded. The macroinvertebrates were extracted with Teflon-coated forceps, placed in tightly capped bottles with 70% isopropyl alcohol, and stored before sorting and identification. Macroinvertebrate samples subsequently were hand-sorted and spread on a Petri dish for ease of viewing. The samples were identified by visual examination with the naked eye to the lowest practical taxonomic level using identification keys from Ingram et al. (1997) and Gooderham and Tsyrlin (2002).

2.4 Data analysis

2.4.1 Statistical and geospatial analysis

Summary and inferential statistics were calculated using Microsoft Excel. Ordinary Kriging in ArcMap V.10.3 was applied for geospatial analysis of the metal levels in the wetland sediments. Ordinary Kriging is a geostatistical interpolation technique that employs a weighted-neighbour, covariance-based approach and frequently is applied to spatial analysis of sediments (e.g., Andreas et al. 2019).
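The weighted-neighbour idea behind Ordinary Kriging can be illustrated with a minimal sketch. This is not the ArcMap implementation used in the study: the exponential semivariogram, its sill and range parameters, the helper names, and the sample coordinates and Pb values are all assumptions chosen for illustration only.

```python
import math


def exponential_variogram(h, sill=1.0, rng=50.0):
    """Semivariance at lag distance h (assumed exponential model, no nugget)."""
    return sill * (1.0 - math.exp(-h / rng))


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def ordinary_kriging(points, values, target, variogram=exponential_variogram):
    """Return (estimate, weights) at target from the sampled points."""
    n = len(points)
    # Semivariances between all sample pairs, bordered by the
    # unbiasedness constraint that forces the weights to sum to 1.
    a = [[variogram(math.dist(p, q)) for q in points] + [1.0] for p in points]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    solution = solve_linear(a, b)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights


# Hypothetical sample coordinates (m) and Pb levels (µg/g), not study data.
sites = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
pb = [50.0, 30.0, 40.0, 20.0]
estimate, weights = ordinary_kriging(sites, pb, (5.0, 5.0))
```

In practice the semivariogram would be fitted to the observed data rather than assumed, and the mapped surface would be built by repeating the estimate over a grid of target locations.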

2.4.2 Macroinvertebrate analysis

The Shannon-Wiener Diversity Index was used to summarize the benthic macroinvertebrate survey from the Warralily wetland:

$$H^{\prime} = -\sum_{i} \left(\frac{n_{i}}{N} \cdot \ln \frac{n_{i}}{N}\right)$$
(1)

In Eq. (1), \({n}_{i}\) is the number of individuals belonging to taxon \(i\) and \(N\) is the total number of individuals in the sample. A community with only one species would have an \({H}^{^{\prime}}\) value of 0 because \(\frac{{n}_{i}}{N}\) would equal 1 and \(\ln\frac{{n}_{i}}{N}\) would be 0. If the species are evenly distributed then the \({H}^{^{\prime}}\) value would be comparatively higher. As such, the \({H}^{^{\prime}}\) value reflects not only the number of species but also how abundance is distributed among the species within the community. The Shannon-Wiener Diversity Index is used globally to assess sediment and ecosystem health (Wijeyaratne and Liyanage 2021) and has been applied for wetlands in Australia (Awal and Svozil 2010).
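The behaviour of Eq. (1) described above can be checked with a short sketch. The function name and the taxon counts are hypothetical and are not survey data from this study; the natural logarithm is used, as written in Eq. (1).

```python
import math


def shannon_wiener(counts):
    """Shannon-Wiener diversity H' from per-taxon individual counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one individual is required")
    # Taxa with zero individuals contribute nothing to the sum.
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


# Hypothetical counts, not survey data from this study.
single_taxon = shannon_wiener([25])            # one taxon only
even_community = shannon_wiener([5, 5, 5, 5])  # four equally abundant taxa
uneven_community = shannon_wiener([17, 1, 1, 1])
```

With these counts, the single-taxon community evaluates to zero, the evenly distributed community reaches the maximum for four taxa (ln 4), and the community dominated by one taxon falls between the two.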

2.5 Literature review

We conducted a semi-systematic literature review (Snyder 2019) that generally followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Selçuk 2019; Page et al. 2021), although we did not conduct a meta-analysis of data or undertake quantitative analysis. The literature search was undertaken using Google Scholar with the key terms: metal levels in urban street dust, metals in stormwater runoff, metal accumulation in stormwater detention ponds, sediment accumulation and management in detention ponds, metal deposition in constructed wetlands, sediment forebays in constructed wetland design, metal treatment by raingardens, NbS and metal treatment, WSUD and metal treatment, treatment wetlands and habitat, WSUD design criteria for treatment of stormwater runoff, and macroinvertebrate diversity. A minimum of 100 papers initially was identified for each key term phrase. These papers were screened for relevancy, clarity and completeness of sampling and analytical methodology, with a minimum of 10 papers in each key term phrase ultimately being selected for full review. The exception to this key word assessment was for urban street dust in which 200 papers published post-2005 initially were identified and 91 were reviewed in detail. The reason for this additional focus on urban street dust was because it is an important source of metals to NbS features and it was a featured theme of our past collaborative efforts with Dr Droppo.

3 Results and discussion for the Warralily and Grand Lakes constructed wetlands

3.1 Metal levels

Although XRF provides results for a suite of metals, in this study we focused on Cd, Cr, Cu, Pb, and Zn, as these are commonly reported in the literature for urban environments. The Relative Per Cent Difference (%RPD) between the CRM and the laboratory measurements at the start and end of batch analyses was < 10% for Pb, Cu, Zn, and Cd, which is an acceptable range for XRF units (US EPA 2006). The Cr %RPD was poor, at 70%, but the CRM level is close to the Limit of Detection (LOD) for this XRF unit and as such there is greater uncertainty for the Cr levels in this study. The mean %RPD between duplicate samples was: Cd (2%), Cr (34%), Cu (21%), Pb (2%), and Zn (16%). Except for Cr, which is slightly high, the %RPD values are within an acceptable range (Kimbrough and Wakakuwa 1989).
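As a minimal sketch of the QA/QC calculation, the conventional definition of %RPD (the absolute difference between two values divided by their mean) is assumed here; the text does not state the exact formula applied, and the duplicate readings shown are hypothetical.

```python
def relative_percent_difference(a, b):
    """Absolute difference between two values as a percentage of their mean."""
    mean = (a + b) / 2.0
    if mean == 0:
        raise ValueError("RPD is undefined when the mean of the pair is zero")
    return abs(a - b) / mean * 100.0


# Hypothetical duplicate Pb readings (µg/g), not data from this study.
rpd_pb = relative_percent_difference(41.0, 39.0)  # 2 / 40 * 100 = 5 %
```

The same function applies to a CRM comparison by passing the certified value and the measured value as the pair.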

The summary statistics for total metal levels from the two constructed wetlands are shown in Table 1. The Warralily wetland, which is more recently constructed and has a lower density residential area than Grand Lakes, exhibits lower mean concentrations of Pb, Cu, and Zn, although due to the large standard deviations, the differences are associated with relatively high P-values (0.07–0.13) for one-tailed, two sample t-tests. The median values (Table 1) also indicate that a small number of higher concentrations affect mean Pb, Cu, and Zn for Grand Lakes and given the relatively high P-values the t-test results should be interpreted with some caution. The mean level of Cr in the Warralily wetland is significantly higher than for Grand Lakes (P = 0.00008) based on a one-tailed, two sample t-test.

Table 1 Summary statistics for metal levels in the Warralily and Grand Lakes constructed wetlands

Various approaches have been used to assess metal levels in wetlands, including enrichment factors, a geo-accumulation index (Igeo), and sediment quality standards (Nasirian et al. 2016; Xu et al. 2019; Yan et al. 2020). Our preference is to compare results using sediment quality standards since they are derived through epidemiological evidence related to the aquatic organisms of interest and thereby reflect potential biological impact. Different countries throughout the world have established sediment quality guidelines, but MacDonald et al. (2000) conducted an extensive survey and, using a consensus-based approach, recommended sediment quality guidelines (SQGs) for 28 chemicals of concern in freshwater sediments, whereby two trigger levels were identified. The first trigger level, known as the Threshold Effect Concentration (TEC), represents the level below which harmful effects on aquatic organisms are not expected, while the Probable Effect Concentration (PEC) represents a value above which there is likely to be a frequent adverse effect on aquatic organisms. The TEC and PEC values recommended by MacDonald et al. (2000) are summarized in Table 2. Australia has developed Default Guideline Values (DGVs) and Upper Guideline Values (GV-high) for sediment (https://www.waterquality.gov.au/anz-guidelines/guideline-values/default/sediment-quality-toxicants) and these also are presented in Table 2. Similar to the TEC, the Australian sediment DGVs are meant to represent concentrations below which there is a low risk of harmful impact, while the GV-high values indicate concentrations at which toxicity-related adverse effects already may be observed. The Australian guidelines also suggest: (i) sediment < 2 mm in diameter be assessed; (ii) metals concentrations could be normalized for organic carbon content; and (iii) multiple lines of evidence (e.g., a combination of chemistry, laboratory ecotoxicology, and field ecology (including sensitive species distribution); Batley and Simpson 2008) be used in a weight-of-evidence approach to better assess risk to a sediment ecosystem. Since the TEC and PEC values are more conservative than the DGV and GV-high values (Table 2), we have elected to focus comparisons of our results to the TEC and PEC.
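The two-trigger screening logic described above can be sketched as follows. The TEC and PEC values are those commonly quoted from MacDonald et al. (2000) and should be verified against Table 2 before any use; the function name, the class labels, and the sample concentrations are hypothetical.

```python
# Consensus-based TEC and PEC values (µg/g dry weight) as commonly quoted
# from MacDonald et al. (2000); check against Table 2 before any real use.
SQG = {
    "Cd": (0.99, 4.98),
    "Cr": (43.4, 111.0),
    "Cu": (31.6, 149.0),
    "Pb": (35.8, 128.0),
    "Zn": (121.0, 459.0),
}


def classify(metal, concentration):
    """Place a total metal concentration relative to its TEC and PEC."""
    tec, pec = SQG[metal]
    if concentration < tec:
        return "below TEC"
    if concentration <= pec:
        return "between TEC and PEC"
    return "above PEC"


# Hypothetical sediment concentrations (µg/g), not data from this study.
sample = {"Pb": 20.0, "Zn": 300.0, "Cu": 200.0}
screening = {metal: classify(metal, value) for metal, value in sample.items()}
```

A screening of this kind addresses only the chemistry line of evidence; it does not replace the weight-of-evidence approach suggested by the Australian guidelines.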

Table 2 TEC and PEC consensus-based sediment quality guidelines (µg g−1) from MacDonald et al. (2000) and Australian Default Guideline Values (DGVs) and Upper Guideline Values (GV-high) (µg g−1, dry weight) from https://www.waterquality.gov.au/anz-guidelines/guideline-values/default/sediment-quality-toxicants

Comparing the PEC values with the individual sample site values for the two constructed wetlands, it seems that the sediments are not grossly polluted (Table 1), but there are PEC exceedances and given the relatively short history of the wetlands, it would be prudent to establish a regular monitoring program. We emphasize here that both the SQG values and our XRF results represent total metal levels and are not fractionated through any sequential extraction to assess bioavailability.

The generally large standard deviations in Table 1 suggested that there may be some spatial variability for the metal levels within the two constructed wetlands, which was explored using Ordinary Kriging in ArcMap V.10.3. The Warralily wetland exhibited higher levels for some metals such as Cr and Ni (because there was no CRM for Ni, we do not report results here) near the inlet and lower levels near the outlet (Fig. 6). Other metals at Warralily (Pb, Cu, Zn) did not exhibit a clear spatial trend (Fig. 6). The central area of the Grand Lakes wetland exhibited a dominant trend of higher metal levels in the sedimentation forebay and macrophyte zone consisting of a combined vegetated (marsh) and open water area that receives a large discharge from the adjacent residential area (Fig. 7).

Fig. 6

Spatial distribution of Cr levels (top) and Pb levels (bottom) in Warralily wetland sediment

Fig. 7

Spatial distribution of Cu, Pb, and Zn in Grand Lakes wetland sediment

3.2 Benthic macroinvertebrates

Collectively, the five outlet (effluent) sites scored a higher \({H}^{^{\prime}}\) (1.602) than the five inlet (influent) sites (0.699) of the wetland. If the five individual sites at each of the inlet and outlet locations are considered separately, a two-sample, one-tailed t-test showed the mean \({H}^{^{\prime}}\) value at the outlet sites to be significantly greater than the mean value at the inlet sites (P = 0.034). The lower \({H}^{^{\prime}}\) value at the inlet may be expected, as higher levels of pollutants generally reduce aquatic ecosystem diversity (Johnston and Roberts 2009). Furthermore, the presence or absence of macroinvertebrate species and their associated tolerance to pollutants often is used as an indicator to evaluate the level of pollution in aquatic ecosystems (Ollis et al. 2006; Gomes and Wai 2020). The majority of benthic organisms at the inlet area belonged to the very pollutant tolerant (left-handed snail, aquatic worms, and blood midge) and fairly pollutant tolerant (midges) groups. In the outlet area, four of the eight taxa found were pollutant intolerant (caddisfly larvae, riffle beetle, water penny, and right-handed snail).

The results of the benthic macroinvertebrate analysis suggest that macroinvertebrate community health and diversity increase between the inlet and outlet of the wetland. In part, this pattern may be related to the spatial trends of some metals (e.g., Cr) associated with bed sediment in the wetland (Fig. 6). Of course, there may be other factors that affect the benthic macroinvertebrate community assemblage, including water quality, sediment texture, and vegetation (Zhou et al. 2019; Dalu and Chauke 2020). The YSI 6920 datasonde measurements indicated that the mean chlorophyll a levels for both the inlet (11.14 µg l−1) and outlet (13.23 µg l−1) areas are in the general mesotrophic range (Irvine and Murphy 2009) and below the default trigger value for wetland ecosystems in southwest Australia of 30 μg l−1, under Australian water quality guidelines. The mean dissolved oxygen levels at both the inlet (5.88 mg l−1) and outlet (5.93 mg l−1) also were sufficient to support a diversity of organisms. It is recognized that the water quality monitoring was conducted over a short period of time and that benthic macroinvertebrate assemblages would reflect the integration of longer-term water quality trends. The texture of the two bed sediment samples collected near the wetland inlet tended to be coarser, with an average of 29.4% gravel and 4.8% silts and clays, compared to the two outlet sites which had an average of 11.9% gravel and 23.6% silts and clays. Given hydraulic sorting in a well-designed constructed wetland, this size gradation is not unusual. Studies have reported differing impacts from sediment size with respect to diversity and abundance of macroinvertebrates. For example, Duan et al. (2008) reported a more diverse community associated with coarser material (gravel). Khudhair et al. (2019) acknowledged that while some studies had found a greater diversity with coarser material, a silt-humus mix exhibited greatest macroinvertebrate diversity at their study wetland in China. The H' values for this study wetland in China ranged between 0.693 and 2.558 (Khudhair et al. 2019). In comparison, Foomani et al. (2020) reported H' values in the range of 0.15–1.55 for an anthropogenically impacted wetland in Iran. The various physical and chemical factors that could impact the macroinvertebrate population in the wetlands should be explored in more detail and, in particular, the macroinvertebrate diversity analysis should be extended to other wetland systems in Geelong and include a longer monitoring time frame to reduce uncertainty in the analysis.

4 Discussion

4.1 Sources of particle-bound metals

Following the urban continuum concept discussed in the Introduction, we expect that the primary source of particle-bound metals to an NbS design, such as the Geelong wetlands, would be washoff from impervious surfaces and erosion associated with pervious surfaces. Vermette et al. (1987) outlined six general “layers” or sources of particle-bound metals in the urban environment that generally were attributed to natural contributions (local bedrock, local soils) and anthropogenic contributions (including the urban mosaic of building materials, ranging from concrete to asphalt; as well as inputs from local emissions and long-range transport, or re-entrainment from historical deposition). Considerable research has focused on the relationship between land use and metal levels, with conventional wisdom suggesting, for example, that street sediment from industrial areas and higher density traffic areas may be associated with higher metal levels. However, short-range atmospheric redistribution (e.g., due to resuspension by natural wind or vehicle-associated turbulence) and short-range atmospheric transport can confound these simple relations. For example, residential areas near heavy industry that has active smoke stack emissions may have higher metal levels than residential areas in the same city, but distant to the industry. Anthropogenic sources can accumulate in association with both pervious and impervious surfaces and while most research has focused on impervious surface washoff, some research has pointed to the importance of eroding pervious urban surfaces as well (Irvine et al. 1992; Shikangalah et al. 2016; Ferreira et al. 2021). As a cautionary note, Muller et al. (2020) provided a more recent review of pollution sources in urban runoff and concluded that “…because of the rapid advances in clean manufacturing and pollution control technologies, a large part of the body of data on stormwater quality available in the literature should be considered as historical data, which may no longer describe well the current conditions.”

Our literature search using the key term “metal levels in urban street dust” returned 89,000 hits. Much of the early work on metals in street sediment (1980s–2000) was focused on North America and Europe (Harrison 1979; Hopke et al. 1980; Ellis and Revitt 1982; Dong et al. 1984; Vermette et al. 1987, 1991; Pitt et al. 2005), of course, with exceptions (e.g., Ramlan and Badri 1989 (Malaysia); Hewitt and Candy 1990 (Ecuador); Ho 1990 (Hong Kong); Salim Akhter and Madany 1993 (Bahrain)). The refined search of the literature from 2005 onward (91 papers fully reviewed) showed a perceptible spatial shift in the origin of these more recent papers, with China having 39% of the publications; South and Central Asia 18%; Europe 14%; Africa 9%, Middle East 5%, Southeast Asia 2%, North America, South America, Australia, and Russia having 1% each, and 3% of the papers had a global focus. Haynes et al. (2020) also reported that Asia, and particularly China, had an increasing presence in the street sediment literature. Much of the earlier work tended to focus on characterization of the elemental composition to better ascertain pollutant sources and/or the degree to which they may contribute to the degradation of the aquatic environment (Irvine et al. 2009). Since 2005, there seems to have been a thematic shift that is particularly pronounced regionally, as human health risk is the primary focus of 50% of publications from Asia, followed by source identification (24%), degradation of the aquatic environment (9%), general characterization of metal levels and related aspects (15%), and seasonal or temporal variability (2%). The human health risk theme is less pronounced for the rest of the world’s publications since 2005 (22%), with source identification accounting for 47% of the publications, degradation of the aquatic environment (16%), general characterization of metal levels and related aspects (12%), and seasonal or temporal variability (3%).

In assessing human health risk or degradation of the aquatic environment, the bioavailability or speciation of the metals ideally would be determined using some form of sequential extraction (cf. Tessier et al. 1979; Sutherland and Tack 2007). There are early (pre-2005) examples of such analyses (e.g., Gibson and Farmer 1984; Hamilton et al. 1984; Ramlan and Badri 1989; Stone and Marsalek 1996; Sutherland 2002), although in our review of the more recent literature, only 17% of the publications had conducted some type of speciation analysis. Frequently, it is reported in the literature that metal levels in aquatic systems are related to the size of the sediment, as smaller particles (particularly the colloidal sizes) have greater adsorption capacity and also may preferentially flocculate to maximize adsorption opportunity (Droppo 2001; Leppard and Droppo 2005; Förstner and Wittmann 2012; Unda-Calvo et al. 2019). However, as was the case with metals speciation, since 2005, a relatively low percentage of studies (13% from our review) evaluated the relationship between size and metal levels, but consistent with the earlier literature a general inverse relationship between particle size and metal levels was reported (e.g., Xu et al. 2013; Khademi et al. 2020). This issue is particularly critical since air and water quality management options, including street sweeping and settling of particle-bound metals, will be impacted by particle size (Irvine and Droppo 1999; Irvine et al. 2009; Owens et al. 2011; Hixon and Dymond 2018).

Two additional points of note were reflected in the literature reviewed from 2005 onward. First, we only reviewed papers that provided sufficient detail on their sampling and analytical methodologies and found that a variety of analytical instrumentation has been employed by these different studies, including atomic absorption spectroscopy (AAS) with various furnace types, inductively coupled plasma with optical emission spectrometry (ICP-OES), inductively coupled plasma with atomic emission spectroscopy (ICP-AES), inductively coupled plasma with mass spectrometry (ICP-MS), and X-ray fluorescence (XRF). Different analytical techniques can introduce some uncertainty in comparing results between projects. Second, pre-treatment of the sampled street dust typically included some type of dry sieving as a first step, to remove litter, organic material such as leaves, and coarser sediment (e.g., gravels). However, the reported sieve size used in this step included 2 mm, 1 mm, 0.75 mm, 0.355 mm, 0.15 mm, 0.125 mm, and 0.063 mm. This variation also generates uncertainty in comparing different projects.

Despite some of the analytical and operational challenges in undertaking comparisons between studies, it is not uncommon for study site results to be placed in context with metal levels reported from other locations in the world (e.g., Hu et al. 2016; Cai and Li 2019; Rahman et al. 2019; Jahandari 2020). Recent publications by Haynes et al. (2020), Roy et al. (2022), and Dietrich et al. (2022) have provided a more rigorous global review of metal levels in street dust and all noted similar shortcomings in standardization of analytical methods, while Dietrich et al. (2022) also called for a greater diversity of studies representing different climate types.

4.2 Metal accumulation in stormwater ponds, WSUD features, and NbS designs

Marsalek et al. (1992) noted that from the 1970s stormwater ponds increasingly were used to manage urban runoff quantity and reduce localized flooding, but by the 1990s, designs were changing to focus on water quality improvement as well (Mayer et al. 1996). Mayer et al. (1996) concluded that the early stormwater ponds (1970s–1990s) generally had less treatment capacity because of sub-optimal settling considerations, while later pond designs improved settling characteristics and treatment (e.g., Egemose et al. 2015). There appear to be conflicting reports on the dynamics of metal accumulation in stormwater pond sediments. Casey et al. (2007) suggested that metal levels in sediment and invertebrates were at steady state in the ponds they examined and that the risk to organisms did not vary as a function of pond age, while Egemose et al. (2015) found that recent ponds (1–2 years old) efficiently removed a suite of metals, but ponds 30–40 years old removed only Pb, Ni and Zn, and with decreasing effectiveness over time. In contrast to our results for the Geelong wetlands, German and Svensson (2005) found that metal levels did not vary much between or within the ponds they studied. Varying levels of risk to aquatic organisms have been reported for the sediment in stormwater ponds. Camponelli et al. (2010) reported that Cu levels frequently exceeded threshold effect concentrations and Zn often exceeded probable effect concentrations, while Casey et al. (2007) found Pb and Cu levels generally remained below threshold effect concentrations in their study. Gallagher et al. (2011) noted that 96% of the 68 stormwater ponds that they surveyed in the state of Maryland, USA, exceeded threshold effect concentrations for at least one trace metal. Waara and Johansen (2021) concluded that ecological risk for organisms in the older ponds from industrial areas that they sampled commonly occurred for Cd, Cu, and Zn. However, Istenič et al. (2012) found that the movement of metals from plant roots to the above ground tissues of plants was low and that the risk of metals transfer to the surrounding ecosystem was negligible. In a study with caged mussels, Soberg et al. (2016) showed that the metal levels in the stormwater ponds did not pose a risk to habitat function. Similarly, in a study of metal levels in sediment and aquatic flora and fauna (individual species not identified) from nine stormwater ponds and 11 small natural lakes in Denmark, Stephansen et al. (2014) found no difference between the stormwater ponds and the lakes. Karouna-Renie and Sparling (2001) reported that the levels of Cu, Zn, and Pb in invertebrates from 20 ponds draining commercial, residential, highways, and open spaces in the State of Maryland were less than dietary concentrations considered toxic to fish. With respect to pond design, Egemose et al. (2015) concluded that pond treatment efficacy was greater with a higher pond volume to catchment area ratio, while Starzek et al. (2005) in their survey of 26 stormwater ponds in Sweden also found a general trend towards greater accumulation rates in larger ponds, although pond shape did not seem to have an impact on deposition.

Moving forward from the stormwater pond era into the more recent WSUD period, accumulation of metals in bed sediments has been evaluated for constructed wetlands, but the focus on bioretention cells has been more related to quantifying concentration or loading reductions by comparing inputs and outputs (Shafique and Kim 2015; Ahammed 2017; Flanagan et al. 2018; Shchukin 2021), with very little done on substrate accumulation (Guo et al. 2021). Baddar et al. (2021) reported that total concentrations of Cu and Zn increased in a constructed wetland in the State of South Carolina, USA, over a seven-year period from 6.0 ± 2.8 and 14.6 ± 4.5 mg kg−1 to 139.6 ± 87.7 and 279.3 ± 202.9 mg kg−1 dry weight, respectively. The latter period values are similar to those reported for the Grand Lakes wetland (Table 1). As with the stormwater pond research, concern about metals impact on habitat and aquatic organisms has been expressed. Gill et al. (2014), for example, monitored the accumulation of heavy metals in both the sediment and the plants growing in an Irish wetland over a 6-year period of operation and concluded that metal accumulation in the vegetation compared to the sediment was negligible. Malaviya and Singh (2012) observed that while design criteria for constructed wetlands treating municipal wastewater exist (e.g., Polprasert and Koottatep 2017), similar criteria for wetlands treating urban and highway runoff had not been fully established because urban runoff is more difficult to control due to the non-point source uncertainties and flashiness of the rainfall-runoff system. Al-Rubaei et al. (2016) evaluated a constructed wetland in Sweden that had been operating for 19 years and found that it still effectively managed Cd, Cu, Pb, Zn, TSS, and TP, with treatment efficacy for event mean concentrations ranging between 89 and 96%. It was noted that apart from removal of sediment in the forebay, no other maintenance had been conducted, with the conclusion being that if designed well and regularly inspected, constructed wetlands can work efficiently for at least two decades.

Finally, we note that of all the studies on metal accumulation in pond and wetland sediment reviewed herein (and including our study of the Geelong wetlands), only Camponelli et al. (2010) included a sequential extraction/bioavailability assessment, in which they concluded that 93% of the Cu concentrations were not readily bioavailable, while 40% of Zn was not bioavailable. The focus on total metals concentration perhaps is understandable, since it is analytically simpler and most sediment quality guidelines are established for total rather than fractionated metals.

As it is an emerging field, the NbS literature is newer and while it has focused extensively on socio-economic, design, and policy issues (Frantzeskaki 2019; Collier and Bourke 2020; Tozer et al. 2020; Li et al. 2021; Jessup et al. 2021; Dorst et al. 2022), ecosystem service provisioning, including water quality treatment and flood management, also is being investigated (Oral et al. 2020; Dutta et al. 2021; Kumar et al. 2021). Although there is relatively little work specifically on metal accumulation in sediment, Conte et al. (2020) described an interesting community-based project that explored metal accumulation in soils and plants as part of a phytoremediation/urban gardens demonstration. Because of its more holistic approach, NbS seems to have greater potential to both engage the public and facilitate multidisciplinary interactions.

Specifically in relation to the Warralily and Grand Lakes constructed wetlands, we note that Melbourne Water, in association with the Victoria State Government, recently revised their 2010 design guidelines for constructed wetlands (Melbourne Water 2020, https://www.melbournewater.com.au/building-and-works/developer-guides-and-resources/standards-and-specifications/constructed-wetlands). The Warralily and Grand Lakes constructed wetland designs were consistent with these guidelines, as summarized by the general design guideline schematic presented in Fig. 8. While the Melbourne Water design guidelines generally recognize the value of constructed wetlands in treating metals, pathogens, and organic compounds, current quantitative requirements only target suspended solids (80% retention of the typical urban annual load), total phosphorus (45% retention of the typical urban annual load), and total nitrogen (45% retention of the typical urban annual load). The combined sedimentation basin/forebay and macrophyte plantings at the Grand Lakes wetland were particularly effective at trapping metals prior to distribution through the remaining wetland area (Fig. 7), although periodic dredging would be required to maintain long-term treatment capacity (e.g., Al-Rubaei et al. 2016).
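The retention targets quoted above lend themselves to a simple check. The following is a minimal sketch only: it assumes that retention can be approximated as the fraction of an inflow load that does not leave the wetland, and the function names and the example loads are hypothetical rather than taken from the Melbourne Water guidelines or from our monitoring data.

```python
# Retention targets as quoted in the text (fraction of the annual load retained).
TARGETS = {"TSS": 0.80, "TP": 0.45, "TN": 0.45}


def retention(load_in, load_out):
    """Fraction of the inflow load retained (assumed definition: (in - out) / in)."""
    if load_in <= 0:
        raise ValueError("inflow load must be positive")
    return (load_in - load_out) / load_in


def check_targets(loads):
    """Map each parameter to True if its retention meets or exceeds the target.

    `loads` maps a parameter name to an (inflow, outflow) pair in the same units.
    """
    return {
        name: retention(load_in, load_out) >= TARGETS[name]
        for name, (load_in, load_out) in loads.items()
    }


# Hypothetical annual loads (kg/year), for illustration only.
example_loads = {"TSS": (1000.0, 150.0), "TP": (10.0, 6.0), "TN": (50.0, 25.0)}
print(check_targets(example_loads))
```

Metals are absent from the target dictionary because the guidelines, as noted, do not set quantitative requirements for them.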

Fig. 8 Constructed wetland design plan view (left) and longitudinal profile (right) for stormwater management as outlined by Melbourne Water (https://www.melbournewater.com.au/building-and-works/developer-guides-and-resources/standards-and-specifications/constructed-wetlands). NWL, normal water level; TED, top of extended detention; EDD, extended detention depth

4.3 Implications for nature-based solution design–current practices and knowledge gaps

Although there is relatively little technical literature on NbS design, per se, at this point it is possible to adapt past research and guidelines on stormwater ponds, constructed wetlands, and other WSUD features to guide the NbS design process (e.g., Wang et al. 2017; PUB 2018; Malaviya et al. 2019; Melbourne Water 2020, https://www.melbournewater.com.au/building-and-works/developer-guides-and-resources/standards-and-specifications/constructed-wetlands). The recent focus on NbS really is an extension of early urban design and landscape architecture typologies, for example, the green visions of Frederick Law Olmsted in North America (ca. 1857–1895) that included Central Park in New York City and Llewellyn Park (1857), an English-inspired suburb of New Jersey, with expanses of forest, parkland, and water features; Ebenezer Howard’s Garden Cities of Tomorrow (ca. 1902); and Ian McHarg’s seminal 1969 work, Design with Nature. But further to these earlier works, NbS has tapped into a more collective sense of community, sustainability, resiliency, and smart city growth with its holistic approach to design (Frantzeskaki 2019; Ruangpan et al. 2020; Irvine et al. 2022). While some have cautioned against the use of natural systems, such as wetlands, for the dual purposes of water quality improvement and aquatic habitat enhancement (Helfield and Diamond 1997), our review suggests there have been mixed results with respect to aquatic organism impacts and we recommend that site-specific sampling and investigations should be an important component of any NbS development plan. Our results for the wetlands in Geelong indicate that levels of Pb, Cu, Zn, Cr, and Cd are elevated at some sample sites, reaching the probable effect concentration, and this may have an impact on benthic macroinvertebrate community assemblage in those locations. Such findings warrant further monitoring, and certainly, an analytical tool such as XRF can be helpful in quick screening. XRF can rapidly provide accurate information on total concentrations at low levels, with no pre-chemistry required. However, as noted, there are relatively few studies of urban street dust and stormwater pond/wetland sediments that have implemented sequential extraction techniques to better assess bioavailability. This type of confirmation analysis should be done if screening of total concentration indicates an issue with elevated metal levels. Furthermore, as reflected by Table 2, there are differences in adopted sediment quality guidelines and more work on resolving these differences and refining the guidelines is needed. A weight-of-evidence approach to assessment of ecological risk, as suggested in the Australian guidelines, is worth pursuing and should include evaluation of macroinvertebrate assemblages (population density and diversity) as well as bioaccumulation of metals in sentinel species. The Warralily constructed wetland case study illustrated the value of assessing macroinvertebrate assemblages to enhance understanding of contaminant accumulation in constructed wetlands.
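The two-step approach recommended above (screening of total concentrations, followed by confirmation analysis where levels are elevated) can be sketched as follows. This is an illustrative sketch, not the procedure used in our study. The threshold effect concentration (TEC) and probable effect concentration (PEC) values are the widely cited consensus-based freshwater values of MacDonald et al. (2000), entered here as assumptions that should be checked against the guideline set adopted for a given project (cf. Table 2). The function names are hypothetical, and the example concentrations are the later-period Cu and Zn means reported by Baddar et al. (2021).

```python
# Assumed consensus-based guideline values, mg/kg dry weight: (TEC, PEC).
# Verify against the guidelines adopted for the study before any real use.
GUIDELINES = {
    "Cd": (0.99, 4.98),
    "Cr": (43.4, 111.0),
    "Cu": (31.6, 149.0),
    "Pb": (35.8, 128.0),
    "Zn": (121.0, 459.0),
}


def screen_sample(totals):
    """Classify each total metal concentration relative to its TEC and PEC."""
    classes = {}
    for metal, concentration in totals.items():
        tec, pec = GUIDELINES[metal]
        if concentration >= pec:
            classes[metal] = "above PEC"
        elif concentration >= tec:
            classes[metal] = "between TEC and PEC"
        else:
            classes[metal] = "below TEC"
    return classes


def needs_confirmation(classes):
    """Suggest sequential extraction if any metal reaches or exceeds its TEC."""
    return any(label != "below TEC" for label in classes.values())


# Later-period mean totals from Baddar et al. (2021), mg/kg dry weight.
screened = screen_sample({"Cu": 139.6, "Zn": 279.3})
print(screened, needs_confirmation(screened))
```

Using the TEC as the trigger for follow-up is a conservative choice made for this sketch; a project could equally adopt the PEC or a guideline-specific trigger value.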

Despite advances in our knowledge on green infrastructure performance in recent decades, a number of knowledge gaps exist, in addition to the issue of metals bioavailability:

Metal accumulation in bioretention features

While the information on metal accumulation in stormwater ponds and wetlands is fairly robust, very limited information is available on metal accumulation in bioretention features, as to date, most of the research has focused on determining efficacy of treatment by comparing concentrations in runoff at the inlet and outlet. It is important to assess metal accumulation in bioretention features to help develop recommendations for maintenance, including substrate renewal, and risk to biota.

Design

Technical aspects of NbS feature designs in general are reaching maturity. For example, sediment forebays and mixed areas of emergent macrophytes and open water seem successful, as demonstrated in the wetlands of Geelong and other studies reported in the literature. However, while there is growing experience with NbS designs in Singapore (e.g., Wang et al. 2017; Irvine et al. 2021a, b), generally NbS designs have been developed for temperate climates and there is a need to understand design implications for tropical climates. Design guidelines generally are developed for individual WSUD features, which may be appropriate in older development retrofits. When planning for new developments, however, the focus of NbS is holistic and systematic. As such, the features should be designed and performance assessed as a collective set within a connected system. Connectivity should be an essential component of NbS designs, for optimal flow of water, organisms, and people. This point is underscored by the Melbourne Water (2020, https://www.melbournewater.com.au/building-and-works/developer-guides-and-resources/standards-and-specifications/constructed-wetlands) design guidelines: “Connectivity is a vital component of stream ecology. Connectivity, maintains baseflow conditions, provides passage for fish, invertebrates and other biota within the waterway, and facilitates the movement of water borne plant propagules within the waterway.” While existing engineering design practices address individual features rather than using a treatment train approach, linear treatment wetlands, as a system, are starting to be implemented in Geelong and engineering design guidelines should be revised to consider this emerging practice. Furthermore, Australian guidelines for WSUD treatment features only consider suspended sediment and nutrients (which is also the case in Singapore, for example), although removal of suspended sediment will in some way address particle-bound metals. Further work is required to consider design guidelines for other parameters, including emerging contaminants like microplastics.

Role of vegetation and substrate amendments on enhancing treatment performance

Research has been conducted on the role of different plants in mitigating metal levels through provision of flow resistance to enhance sediment settling and uptake from the sediment. However, more research is needed on the types of plants that are most effective in these tasks, as well as being robust, and minimizing maintenance costs for different types of NbS features and in different climates. Similarly, more work is needed on the utility of appropriate, low-cost, and local substrate amendments, such as lava stone or biochar.

Full cost accounting

Construction and maintenance costs (for the life of the design) should be clearly evaluated. While construction costs of NbS systems generally are well-established, considerable uncertainty remains with respect to maintenance costs (e.g., substrate renewal and contaminant disposal, vegetation harvesting) and who is responsible for these costs (the developer, the municipality, or the homeowner or community). To this end, assessment of ecosystem services and dis-services, which might include considerations of water yield; sediment, nutrient, and metals yields; carbon sequestration; urban heat island mitigation; air quality mitigation; noise mitigation; habitat provisioning; and aesthetics, seems to be a promising framework.

Urban runoff and street dust as sources to NbS

As we have noted, considerable research has been done on metal levels in urban street dust, and washoff of this dust is an important source of metals discharging to NbS features. Street-sweeping efficiency is particle-size dependent and research has been conducted to improve street-sweeping efficiency, thereby reducing metals loads in the runoff (Irvine et al. 2009). Particle-bound metals in the urban runoff continuum will be affected by the physical and biological characteristics of these particles (Droppo et al. 2002). Distributed, but connected, NbS features (including grassed swales) and traditional hard engineering designs (e.g., minor drainage system gully pots or catch-basins) should continue to be included for management of particle-bound metals.

5 Conclusion

Two residential constructed wetlands in Geelong, Australia, were studied as examples of nature-based solution designs to manage stormwater runoff quality. The levels of Pb, Cu, Zn, Cr, and Cd in a limited number of samples exceeded the consensus-based probable effect concentration for aquatic organisms. The samples with these elevated levels were found in association with the designed depositional areas of the wetlands and were not evenly distributed throughout the entire wetlands. These sedimentation areas should help to minimize maintenance efforts in the future by focusing the areas of accumulation. Analysis of the benthic macroinvertebrate communities in one of the wetlands indicated that there was a less diverse and more pollutant-tolerant assemblage near the inlet sedimentation area, compared to the outlet area. Regular follow-up monitoring of the metal levels and benthic macroinvertebrate community should be conducted.

In combination with the case study of Geelong, a literature review suggested five areas that need further research with respect to metals dynamics and NbS: (i) metal accumulation in bioretention features; (ii) design aspects of NbS in relation to management of particle-bound metals, particularly with respect to different climate types, the performance of a connected system, and developing a standardized weight-of-evidence approach to assessing ecosystem health; (iii) better understanding of the role of vegetation and substrate amendments in improving NbS performance; (iv) full cost accounting of NbS designs, including a better understanding of maintenance requirements and costs, explicitly using an ecosystem services framework; and (v) recognizing the role of urban runoff and street dust characteristics in affecting source discharges to NbS features with a view to better managing the system through distributed green and traditional hard engineering approaches.