Introduction

Alteration of a river’s flow regime, through the construction and operation of dams and weirs, is arguably the most significant threat to the ecological health of the world’s rivers (Sparks 1995; Bunn and Arthington 2002). The use of environmental flows (often also termed environmental watering) is a relatively new restoration technique aimed at returning critical flow components to flow-altered rivers (Arthington et al. 2006, 2010). The use of environmental flows, and the associated scientific discipline, has grown rapidly throughout the world, with a great deal of scientific attention focused on developing approaches to determine the type and volume of flow to be restored (Richter et al. 2003; Acreman and Dunbar 2004). While many environmental flow regimes have been developed and implemented (see for example reviews by Arthington 2012; Gillespie et al. 2014; Olden et al. 2014), there are comparatively few examples of long-term (>3 years) monitoring studies designed to determine the ecological responses to the use of environmental flows (Davies et al. 2014; Olden et al. 2014; but also see examples of long-term studies: Robinson et al. 2003; Robinson and Uehlinger 2008; Bradford et al. 2011; Melis et al. 2012). This is despite the obvious and urgent need to both (1) demonstrate the benefits of environmental flows to managers, the broader public and politicians (Poff et al. 2003); and (2) improve future management of environmental flows for better ecological outcomes. The lack of long-term monitoring studies of the ecological responses to environmental flows has led scientists and policy makers to challenge the discipline to progress faster and in a more rigorous manner, to ensure transparent and defensible decisions, and to develop a suitable body of evidence to support water allocation decisions (Poff et al. 2003; Cottingham et al. 2005; Arthington et al. 2010; Bradford et al. 2011; Olden et al. 2014).

In Australia, Federal and State Governments are implementing large and significant programs to either return environmental flows to flow-altered rivers, or to protect flows in flow-unaltered rivers where increasing water use for development is occurring. The largest program is attempting to deliver environmental flows to 27 major river systems within the Murray–Darling Basin (MDB) in an effort to protect and restore their ecological health (http://www.mdba.gov.au/what-we-do/basin-plan, MDBA 2010). During the development of this major restoration program, managers have encountered three significant issues. Firstly, while there has been an increased focus on understanding the water needs of key aquatic biota and ecosystem function in recent years, there remains a lack of ecological knowledge in many areas, and as such, critical decisions are often made with relatively weak ecological evidence to support them. Secondly, to avoid significant adverse economic and social impacts, not all environmental targets are likely to be met. Not achieving all environmental targets may mean that for some species, river reaches or indeed whole catchments there will be no improvement and biota may continue to decline. Thirdly, these two issues have led to increasing skepticism among stakeholders about the ecological benefits that can be achieved by environmental flows. Indeed, the use of water for environmental purposes is undergoing increasing scrutiny worldwide (Poff et al. 2003), and hence the importance of understanding the ecological responses to environmental flows is increasing.

Three main types of environmental flow monitoring programs are currently employed: (1) ‘condition or program level monitoring’, which assesses ecosystem or population changes over large spatial and temporal scales and identifies longer-term trends. Because program-level monitoring incorporates multiple interacting factors (e.g., land use, climate change), it is difficult to attribute ecological change to flow change; (2) ‘compliance or operational monitoring’, which assesses whether the water delivery targets are met (e.g., volume of water delivered to a wetland); and (3) ‘intervention monitoring’, which assesses ecosystem or population changes in response to a specific intervention (i.e., a single managed flow). In general, intervention monitoring occurs over small spatiotemporal scales; however, long-term responses may be monitored (Gawne et al. 2013). While all three monitoring types inform environmental flow management, correctly applied intervention monitoring provides the strongest inference linking ecological response to flow change. Importantly, intervention monitoring underpins reporting on environmental flow outcomes, improved decision making, refinement of future environmental flow events and future monitoring through the adaptive management process.

A great deal has been published on ecological monitoring (e.g., Lindenmayer and Likens 2010) and monitoring river restoration (e.g., Downes et al. 2002), including monitoring designs for environmental flows (Cottingham et al. 2005; Souchon et al. 2008; Gawne et al. 2013). Despite these studies, many environmental flow monitoring programs are poorly designed (Bernhardt et al. 2005; Kondolf et al. 2007; Webb et al. 2010) due to limited resources or a lack of proper evaluation and refinement as the study progresses (Alexander and Allen 2007; Konrad et al. 2011). In other instances, poor design is due to the challenges associated with environmental flow monitoring, such as identifying reference and control sites (Downes et al. 2002), the application and conceptualization of ecological knowledge (Lancaster and Downes 2010), or a focus on responses to individual events, which makes both inferring longer-term responses and generalizing outcomes difficult (Konrad et al. 2011). Poor reporting (Kondolf et al. 2007; Konrad et al. 2011) also makes it difficult to compare results among studies, limiting the scientific and management advancement of the discipline (Poff et al. 2003).

Souchon et al. (2008) proposed a general monitoring framework for detecting biological responses to flow management, which focused on detecting responses related to changes in habitat. While the framework remains valid, management is increasingly seeking robust and multipurpose monitoring that can both demonstrate immediate outcomes and improve conceptual models and future environmental flow management. Souchon et al. (2008) also acknowledged that there are challenges associated with their framework, particularly in situations where managers and stakeholders seek outcomes beyond habitat quality or availability; for example, flow triggers for key biotic processes and flow thresholds for system connectivity/ecosystem productivity. Our paper contributes to the advancement of environmental flow monitoring by providing practical recommendations that aim to improve the scientific robustness of environmental flow intervention monitoring programs and their relevance to managers and policy officers. The recommendations are built on recent literature and our experience gained from working with stakeholders and managers to design, implement and monitor a range of environmental flow types in Australia. While we use recent literature and our combined experience in this rapidly developing area of environmental flow science to advance and strengthen some of the monitoring design steps proposed in Souchon et al. (2008), we also highlight the limitations of some current approaches, and propose new recommendations for the design, analysis and interpretation of future environmental flow monitoring programs.

Recommendation 1: Environmental Flow Monitoring Programs Should be Implemented Within an Adaptive Management Framework

Monitoring an ecosystem’s response to a management intervention is a key component of the Adaptive Management (AM) cycle (Nyberg 1998; Lindenmayer and Likens 2009). Intervention monitoring is likely to be more effective if it is developed and implemented within the context of the AM program, because the AM program provides a foundation for development of explicit objectives (Olden et al. 2014) and collation of the environmental information required to design the intervention. Further, intervention monitoring that is undertaken within an AM cycle is of great value to management agencies as it supports good public sector governance, facilitating accountability, transparency and efficiency in decision making and also supporting credible communication of the benefits of watering to the broader community. Intervention monitoring within an AM cycle also improves our understanding of the system and its response to an intervention, thereby improving the capacity to predict outcomes and improve the effectiveness of future interventions.

Monitoring of environmental flows within an AM framework ensures a cycle of continuous improvement in the investment strategies and practices of natural resource management (Souchon et al. 2008). Involving both managers and scientists in the AM process also allows programs to re-assess and make relevant changes while the project is on-going (see Souchon et al. 2008; King et al. 2010). Importantly, involving scientists and managers throughout the project allows modifications to be tailored to interventions to accommodate operational needs that may arise (e.g., unexpected flooding of an area). Timely monitoring can also identify undesirable responses or when expected responses have not been met such that a subsequent change or flexibility in flow delivery could be implemented.

Within the MDB, some of the most effective intervention monitoring activities have been undertaken within an AM framework in which scientists have worked closely with managers in the implementation, monitoring, and evaluation of environmental flows. Examples of effective intervention monitoring programs include monitoring of alternative management strategies for blue-green algal blooms (Webster et al. 2000), releases from dams to disturb biofilm accumulations (Watts et al. 2010), pulsed flows to stimulate native fish breeding (King et al. 2009, 2010) and flows to sustain waterbird breeding (Kingsford and Auld 2005; Brandis et al. 2011). However, while monitoring of environmental flows is common in the MDB, few managers believe that monitoring data are being used adaptively or being fed back into management models (Meredith and Beesley 2009). The incorporation of monitoring data into management is a challenge for both scientists and managers, with similar issues being found by reviews of monitoring programs of other river restoration activities (Bernhardt et al. 2005; Brooks and Lake 2007).

Recommendation 2: Objectives of Environmental Flow Programs Should be Well Defined, Attainable, and Based on an Agreed Conceptual Understanding of the System

Vague or poorly specified program objectives are a major problem of many biodiversity monitoring programs (Lindenmayer et al. 2012). The development of an environmental monitoring program first requires clear articulation of the vision and objectives of the overall environmental flow program. Once these are established, they inform sequential objectives and the monitoring program itself, including study design, methodology and indicator selection (Lindenmayer et al. 2012).

The overall objective of environmental flow programs varies depending on the context (Arthington et al. 2010). For example, in flow-altered systems, environmental flows aim to restore key components of the flow regime that have been affected by river regulation or water extraction (e.g., floods, base flows), with the aim of improving ecosystem condition or protecting it from further degradation (Fig. 1a). In contrast, in rivers that have not been significantly affected by flow alteration, but face degradation through increased anthropogenic water use, environmental flows aim to avoid the loss of key components of the flow regime, thereby protecting ecosystem condition (Fig. 1b). The program vision and objectives of both types of environmental flows are clearly different, with one primarily targeting restoration and the second targeting conservation.

Fig. 1

Conceptual diagram showing the two types of environmental flows: (a) in flow-altered rivers, environmental flows aim to restore key components of the flow regime with the aim of improving ecosystem condition or protecting it from further degradation; (b) in flow-unaltered rivers facing potential degradation through future anthropogenic water use, environmental flows aim to avoid the loss of key components of the flow regime, thereby maintaining ecosystem condition

The objectives of environmental flow programs exist within a nested hierarchy of objectives (sensu Kingsford et al. 2011), where the highest order objective is broad and subsequent objectives (or targets) moving down the hierarchy become more specific (Fig. 2). While the development of these steps starts at the top and works progressively down to finer-scale and more refined statements, the outcomes from the measurement of the performance indicators at the base lead to a progressive assessment of each of the higher-level objectives. In general, at the largest spatial scale there is an overarching high-level program objective (e.g., healthy river-floodplain ecosystem), which provides the overall context for the identification of desirable system values or characteristics that can be framed as subsidiary objectives at specific scales (e.g., to sustain wetland health).

Fig. 2

An example of a hierarchy of objectives (sensu Kingsford et al. 2011) for environmental flow programs. Each step down the hierarchy is nested within the previous level. Program objectives aim to inform and meet program vision. Achievement of program objectives is dependent on longer-term implementation of a flow regime (darker gray region, right hand side) with specific targets and indicators. Program objectives will be met by achievement of many individual environmental flow events that also require targets and performance indicators that are nested toward the program objectives (lighter gray region, left hand side)

The process of developing a program’s vision and hierarchical objectives is significantly improved if it is undertaken in consultation with a wide variety of stakeholders. The objective hierarchy is then inherently built on societal values, judgments on trade-offs across stakeholders and current ecosystem understanding (Kingsford et al. 2011; Lindenmayer et al. 2012). The broader the engagement, the more the objective hierarchy will align with society’s expectations and the more widely accepted the restoration or conservation program will be (Gross 2003). Development of the hierarchy of objectives is also often made easier by developing a conceptual model of how the system works. Conceptual models describe our current understanding of system processes and dynamics, and describe the linkages or relationships between activities and ecosystem responses, and can also be a means by which stakeholders develop a common understanding of the system (Gross 2003; Stewardson and Webb 2010). A sound conceptual model of the system also helps to identify which elements of the ecosystem are likely to respond to an intervention, and therefore assists in indicator selection and monitoring program design.

Recommendation 3: Program and Intervention Targets Should be Attainable, Measurable, and Inform Program Objectives

For environmental flow programs, targets can be divided into two types: program targets or intervention targets (Fig. 2). Program targets are aimed at the longer-term objectives of the environmental flow program or flow regime being implemented and feed directly into development of the ‘condition’ monitoring program. Intervention targets (sometimes referred to as monitoring endpoints) are most often applied to specific individual watering or flow events, are generally short term, and inform the development of the ‘intervention’ monitoring program. It is then expected that achieving the short-term intervention targets will contribute to achievement of program objectives (hierarchical objectives). Achievement of program objectives is therefore dependent on longer-term implementation of a flow regime (where specific targets and performance indicators are developed) and on the success of many individual flow events (intervention activities), again where targets and performance indicators are developed and are nested upward into the program objectives.
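
The nesting described above can be sketched as a simple tree in which assessed performance indicators at the base are rolled up to each higher level. This is a minimal illustration only: the class `Objective`, the roll-up rule `fraction_met` (the share of assessed indicators that were met) and the example events and indicators are all hypothetical and are not taken from Kingsford et al. (2011) or from any existing program.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Objective:
    """A node in a hypothetical objective hierarchy (objective, target or indicator)."""
    name: str
    children: List["Objective"] = field(default_factory=list)
    met: Optional[bool] = None  # only meaningful on leaf performance indicators

    def assessed_leaves(self) -> List[bool]:
        """Collect the outcomes of all assessed performance indicators below this node."""
        if not self.children:
            return [] if self.met is None else [self.met]
        outcomes: List[bool] = []
        for child in self.children:
            outcomes.extend(child.assessed_leaves())
        return outcomes

    def fraction_met(self) -> Optional[float]:
        """Illustrative roll-up rule: share of assessed indicators that were met."""
        outcomes = self.assessed_leaves()
        return sum(outcomes) / len(outcomes) if outcomes else None


# Hypothetical example: two flow events nested under one program objective.
event_1 = Objective("Spring pulse", [
    Objective("Nesting attempts above threshold", met=True),
    Objective("Wetland inundated for target duration", met=True),
])
event_2 = Objective("Summer base flow", [
    Objective("Fish larvae detected", met=False),
    Objective("Dissolved oxygen above limit"),  # not yet assessed
])
program = Objective("Sustain wetland health", [event_1, event_2])

print(program.fraction_met())  # share of assessed indicators met across both events
```

A simple proportion is only one possible roll-up rule; a real program would probably weight indicators or require particular indicators to be met before a higher-level objective is regarded as achieved.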

Considerable work has been undertaken on the development of targets in recognition of their importance to natural resource management. A target and its associated performance indicators represent a quantifiable or measurable entity, whose attainment would indicate achievement of a higher-order objective. Targets that are designed to be SMART (Specific, Measurable, Attainable, Relevant, and Time-bound) are particularly useful, as they both influence the design of the monitoring program and improve the ability of the management intervention to be successfully evaluated (McDonald-Madden et al. 2010). However, SMART targets are also difficult in reality to set, as they require an understanding of the expected bounds or confidence limits of the target response.

Target setting is a two-step process: firstly, information about the system is gathered to describe the relationship between current ecological condition (as measured by appropriate indicators) and higher program objectives (e.g., number of individuals in a population and the probability of extinction or the number of species in a community and their likelihood of sustaining species diversity); and secondly a judgment is made about which target is most appropriate (Downes et al. 2002). There are several potential sources of information about the system that should be considered when deciding on what target condition should be aimed for, and these are not necessarily mutually exclusive. Natural condition is seldom a feasible target (Mao and Richards 2012) and is often poorly understood in modified ecosystems (Ramsar 2012). The condition of the system at a previous point in time, as determined using historical information, can also be used and is sometimes better than natural condition, because more information may be available and there may be some estimates of the extent of its variability (through space and time). Alternatively, sites in good condition can be used as a reference and set as the target, as adopted for some macroinvertebrate assessment programs (e.g., AUSRIVAS, RIVPACS; Davies 2000; Simpson and Norris 2000). Another approach is to identify key thresholds that can represent either a condition to achieve or to avoid (Kingsford et al. 2011). Finally, models such as population viability models (Shenton et al. 2012) or Bayesian belief models (Gawne et al. 2012) may be used to inform target development. It is more difficult to develop SMART targets for large-scale generic goals (e.g., Healthy Working River) due to the limited knowledge of the relationship between ecosystem condition and values at such large scales or the historic ecosystem condition and variation of parameters.

Once a SMART target has been developed for environmental condition, specific flow requirements that are needed to sustain the system in that condition can then be identified; for example, by selecting elements of the natural flow regime to preserve (Poff et al. 1997; Richter et al. 1997), by applying habitat availability methods (Manly et al. 2002) or by using known relationships of ecological responses to flow alteration (Poff et al. 2010). The ecological outcome of any environmental flow is complex. The outcome will result from the interaction between the characteristics of the flow event, the character of the system (ecosystem type or river classification; e.g., Poff et al. 2010), the species responding and their specific biological requirements, and the condition of the ecosystem, which is in part the product of antecedent flows (Balcombe et al. 2012; Beesley et al. 2014b). For example, flow characteristics such as timing and flood duration will influence the success of bird and fish breeding events (King et al. 2009; Arthur et al. 2012). Wetland characteristics have also been shown to influence the emergent plant community in response to flooding (Barrett et al. 2010), and organic matter accumulations on floodplains influence the likelihood of anoxic blackwater as a result of warm water flooding (Howitt et al. 2007). This complexity means that our capacity to predict the outcomes of specific flow events will always be limited, and can only be improved by increasing knowledge (Hughes et al. 2005; Harris and Heathwaite 2012), thus emphasizing the importance of AM.

The progress of the restoration trajectory toward a program objective may not be positive at all times or linear, and hence poses additional difficulties for target setting. For example, change may not occur unless flows preceding the environmental flow are suitable, or the condition of the system may appear to decline as it undergoes a transition from one state to another. Systems being targeted for restoration that decline in condition are of obvious concern, as ideally effective restoration should not harm the system (Palmer et al. 2005). However, Jansson et al. (2005) have suggested that sometimes river health may need to go “backward” in order to eventually achieve a program target. For example, restoring flooding after prolonged drought conditions can cause hypoxic blackwater events and associated crayfish and fish kills (King et al. 2012). One approach that may facilitate setting targets for environmental flow events is to consider the restoration or conservation trajectory (e.g., Lake et al. 2007). State and transition models (Rumpff et al. 2011) could be used to facilitate the identification of a target for each successive flow event. While this approach has some significant benefits, it will not fully resolve the tension between the need to develop tightly defined SMART targets and recognition of the inherent ecosystem complexity. Explicitly acknowledging that a range of outcomes are within the bounds of acceptable or predicted outcomes would, therefore, be required, in which case evaluation and interpretations are more difficult, and adaptive monitoring approaches would be required (Lindenmayer and Likens 2009).

Recommendation 4: Intervention Monitoring Programs Should be Designed to Improve Our Understanding of Flow–Ecological Responses and Related Conceptual Models

Intervention targets directly inform the development of monitoring objectives, the hypotheses to be tested and the monitoring program design. To ensure that the monitoring program demonstrates achievement of the target or improves our understanding of the system, it is essential that the objectives are linked conceptually to the intervention target. Linking the objectives to the target can be best achieved by development of an agreed conceptual model of the system and the flow-ecological response relationships (Poff et al. 2010; Kingsford et al. 2011). Efficient monitoring also requires hypotheses to be developed that test linkages in the flow-ecological response relationships developed in the underlying conceptual model. Monitoring and research allow us to develop relationships (both mathematical and conceptual models) that describe system functioning and therefore enhance our capacity to infer or predict the outcomes of future management actions. To date, most environmental flow monitoring has largely focused on reporting outcomes, but for the discipline to advance and restoration to succeed, we suggest that scientists and managers need to focus much more on the development of robust flow–response relationships and the underlying conceptual models which support the environmental flow objectives. The refinement of flow–response relationships may be best achieved through targeted research activities, rather than just monitoring specific management actions or system condition.

The objective of the intervention monitoring program may be to: (1) test whether the intervention achieves its target, i.e., “demonstrating a response”; or (2) generate information that will support future decisions, i.e., “improving a response”, where monitoring is conducted not only to determine whether the intervention has achieved its target, but also to learn about the causal mechanisms and gradation of the responses. The knowledge generated from this second type of monitoring helps improve the effectiveness of future interventions and sits comfortably within the AM cycle. While there is increasing recognition of the need to conduct monitoring that leads to improved watering outcomes, limited resources often mean that monitoring is restricted to the first type only. We contend that, given the paucity of environmental flow science, the first monitoring type (only demonstrating a response) represents a false economy. The overall benefit from improved knowledge about mechanisms or thresholds that lead to improved model refinement and improved flow outcomes should easily justify the additional investment.

Recommendation 5: Indicator Selection Should be Based on Conceptual Models, Objectives, and Prioritization Approaches

An indicator is ‘a characteristic of the environment which, when measured, quantifies the magnitude of stress, habitat characteristics, degree of exposure to the stressor, or degree of ecological response to the exposure’ (Hunsaker et al. 1990) and provides information on the system’s condition. A well-constructed and scientifically supported conceptual model provides a scientific framework for the development of robust objectives and targets, and assists in choosing appropriate indicators. Conceptual models are fundamental to the success of environmental programs, as they provide an integration of system understanding and identification of the complex interactions and relationships between ecological parameters, ecological states and processes (Gross 2003). A conceptual model is also a useful communication tool that can be used to explore and explain complex interactions and processes to a wide audience (Gross 2003). The development of a conceptual model is a useful step in identifying links between program objectives and individual flow management objectives, and can be used to identify those parameters likely to respond to flow that are relevant to the objectives being considered (i.e., indicators).

The conceptual modeling process is likely to highlight a number of potential indicators that could be or should be included in a monitoring program. Consequently, it will be necessary to prioritize indicators using appropriate criteria (see for example Cairns et al. 1993; Downes et al. 2002). Prioritization criteria can be grouped into seven broad themes:

a) Scientific. Analytically sound, credible, integrative, of general importance to ecosystem function.
b) Historic. It has an existing historical record, reliability/proven track record.
c) Systematic. Predictable, pre-emptive, time-bound (within policy time frames).
d) Intrinsic. Measurable, portable, specific, having statistical properties that allow unambiguous interpretation; applicable to many areas, situations, and scales.
e) Practical. Cost-effective, achievable in terms of resource and time demands, not requiring excessive technical expertise.
f) Management. Comprehensive and relevant to current management and target audience; has well-established links with management practices, actions, and policy targets; thresholds can be identified and used to determine when to take action.
g) Value. Social, conservation, economic, or cultural value.
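
One way to apply the seven themes is to score each candidate indicator against every theme and rank candidates by a weighted mean. The sketch below is illustrative only: the 0 to 5 scoring scale, the equal default weights, the function names and the two candidate indicators with their scores are all assumptions, not values from Cairns et al. (1993) or Downes et al. (2002).

```python
CRITERIA = (
    "scientific", "historic", "systematic", "intrinsic",
    "practical", "management", "value",
)


def priority_score(scores, weights=None):
    """Weighted mean of an indicator's scores across the seven themes."""
    weights = weights or {c: 1.0 for c in CRITERIA}
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    total_weight = sum(weights[c] for c in CRITERIA)
    return sum(weights[c] * scores[c] for c in CRITERIA) / total_weight


def rank_indicators(candidates, weights=None):
    """Return indicator names ordered from highest to lowest priority score."""
    return sorted(
        candidates,
        key=lambda name: priority_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical scores on an assumed 0 (poor) to 5 (excellent) scale.
candidates = {
    "waterbird nesting attempts": dict(
        scientific=4, historic=3, systematic=4, intrinsic=4,
        practical=3, management=5, value=5),
    "biofilm biomass": dict(
        scientific=4, historic=2, systematic=3, intrinsic=4,
        practical=4, management=3, value=2),
}

print(rank_indicators(candidates))
```

Changing the weights makes the trade-offs explicit: with equal weights the waterbird indicator ranks first in this example, whereas weighting the practical theme heavily reverses the order.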

A variation to the approach for indicator prioritization integrates the development of a hierarchy of objectives with the identification of indicators (Kingsford et al. 2011; similar to the hierarchy in Fig. 2). Each level in a hierarchy of objectives is developed in a similar manner to a conceptual model, with the next level down in the hierarchy representing the answer to questions posed in the higher level, which might include: how is the objective manifested at the next smaller spatial or temporal scale? What are the key ecological influences on the objective? The questions that are asked may vary from level to level in the development of the hierarchy.

The advantage of developing a hierarchy of objectives is that it enables application of some of the indicator selection criteria in a step-wise fashion, clarifying the logic and illustrating the process. Each step in the hierarchy further defines the list of potential indicators. For example, the ecosystem objective of ‘maintain or restore ecosystem biodiversity, functional diversity and ecology within thresholds of natural variability’ provides broad guidance for the types of parameters that may be considered (Kingsford et al. 2011), but is too broad to define a SMART target and select suitable indicators for monitoring. In our example hierarchy in Fig. 2, a broad program objective of, for example, sustaining wetland health, identifies potential indicators, one of which is “Successful breeding and fledging of Nankeen night herons”; and this can then be developed into a suitable performance indicator for the intervention monitoring program: for example “Greater than 20 Nankeen night heron nesting attempts.” These would also need to be made into relevant time-bound statements for each event or location. The development of a scientifically rigorous objective hierarchy and conceptual model, developed with key stakeholders, ensures that the final monitoring indicators selected are scientifically justified, valid, widely accepted and inform higher objectives.
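
To show what a time-bound statement of the heron indicator might look like, the following minimal sketch encodes a target as a small record and checks a survey result against it. Only the threshold of 20 nesting attempts comes from the text; the class name, the wetland name and the due date are hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class InterventionTarget:
    """A measurable, time-bound performance indicator for one flow event."""
    indicator: str
    threshold: float
    location: str
    due: date

    def is_met(self, observed: float, survey_date: date) -> bool:
        """Met only if the observation strictly exceeds the threshold by the due date."""
        return survey_date <= self.due and observed > self.threshold


target = InterventionTarget(
    indicator="Nankeen night heron nesting attempts",
    threshold=20,
    location="Example wetland",   # hypothetical location
    due=date(2015, 12, 31),       # hypothetical deadline
)

print(target.is_met(observed=25, survey_date=date(2015, 11, 1)))
```

A single threshold is the simplest case; where a range of outcomes is considered acceptable, the record could instead hold lower and upper bounds.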

Regardless of the approach, selecting indicators that support an evaluation of the effectiveness of an environmental flow remains a challenge due to limited knowledge of ecological systems and their responses to flow. Currently, only a few parameters have either been causally linked to flow regime changes or shown to respond in a predictable manner to specific flow events (Reid and Brooks 2000; King et al. 2003; Lloyd et al. 2004; Poff and Zimmerman 2010; Gillespie et al. 2014). For example, only recently have studies in Australia demonstrated some correlative links between various watering attributes or flow components and fish responses (e.g., Balcombe et al. 2006; King et al. 2009; Zampatti and Leigh 2013; Beesley et al. 2014a, b; but also see Bradford et al. 2011). Considering the infancy of knowledge around ecological-flow relationships, we suggest that where possible monitoring programs should also consider new parameters that have strong theoretical support but limited empirical data and test their utility.

Recommendation 6: Appropriate Monitoring Designs and Statistical Tools Should be Used to Measure and Determine Ecological Response

The usefulness of monitoring outcomes for informing management can often be limited due to a lack of accuracy and precision in the data collected (Bearlin et al. 2002). Typically, this lack of accuracy and precision arises from two kinds of error in the sampling process: observational and methodological (Yoccoz et al. 2001). Observational errors occur because we can rarely observe the entire system of interest and must rely on a sampling design that may contain both spatial and temporal components to draw inference about the entire system (Yoccoz et al. 2001). Observational errors can become problematic if they are not recognized and corrected for, as they can lead to incorrect inferences about the data. Sampling error arises when the spatial and temporal arrangement of sample units, and the number of samples collected, fail to accurately and precisely describe the true state of the system. Statistical power can be improved by increasing the number of sites sampled or by distributing sampling effort disproportionately, either by increasing the sampling effort in areas where the variance of the data is high or by limiting variance through a stratified sampling design (Cochran 1946; Krebs 1989). Power analyses using prior data can also help estimate sample size requirements and can be used to determine the relative performance of various sampling designs (Gerow 2007). The statistical design and power of the monitoring program are therefore critical (Krebs 1989; Anderson 2001), and should be considered and discussed with relevant experts at the outset and throughout the duration of the monitoring program.
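
As a rough illustration of the two ideas above, the sketch below estimates how many sites per group are needed to detect a given difference between watered and unwatered sites, and distributes a fixed number of samples among strata in proportion to stratum size multiplied by stratum standard deviation (Neyman allocation). It rests on strong simplifying assumptions that are ours, not the cited authors': a two-sided, two-sample z-test with a known common standard deviation, and stratum variances known in advance (e.g., from pilot data). All function names and example numbers are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_two_group(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test with known common SD."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(effect) / (sd * math.sqrt(2 / n_per_group))
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def min_sites_per_group(effect, sd, target_power=0.8, alpha=0.05, max_n=10_000):
    """Smallest number of sites per group that reaches the target power."""
    for n in range(2, max_n + 1):
        if power_two_group(effect, sd, n, alpha) >= target_power:
            return n
    raise ValueError("target power not reached within max_n sites per group")


def neyman_allocation(total_samples, stratum_sizes, stratum_sds):
    """Split samples among strata in proportion to size x standard deviation."""
    weights = {h: stratum_sizes[h] * stratum_sds[h] for h in stratum_sizes}
    total = sum(weights.values())
    return {h: total_samples * w / total for h, w in weights.items()}


# Hypothetical inputs: a difference of half a standard deviation between groups.
print(min_sites_per_group(effect=0.5, sd=1.0))

# Hypothetical strata: riffles are fewer but more variable than pools.
print(neyman_allocation(30, {"riffle": 40, "pool": 60}, {"riffle": 3.0, "pool": 1.0}))
```

The allocation is returned as fractional sample numbers and would need rounding in practice. For real programs with small samples, estimated variances or repeated measures, a t-based or simulation-based power analysis would be more appropriate than this normal approximation.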

Methodological issues occur because sampling efficiency for the biota can vary with flow or related characters, masking true relationships and increasing the risk of spurious conclusions (Archaux et al. 2012). For example, characteristics of the environment that are related to flow, such as substrate, water velocity, water depth, water clarity, and discharge can affect the sampling efficiency (Korman et al. 2002; Stone 2010; Wisniewski et al. 2013). Two main approaches could be employed to reduce the effects of variable sampling efficiency: (1) standardizing sampling methods, and/or (2) estimating the detection probability directly in the sampling design and analysis. Standardizing sampling methodology by fixing the procedures and techniques of data collection is an attempt to keep the error constant through space and time, and is a simple approach commonly employed by large-scale monitoring programs (e.g., Davies et al. 2010; Simpson and Norris 2000). Standardization works well when the variation in the metric due to variable sampling efficiency is small and predictable relative to the variation in the ecological process of interest (Johnson 2007). One major drawback of standardization is that variation in sampling efficiency will remain unknown, and this affects evaluation of the efficacy of the data for inferring flow effects. We, therefore, recommend that monitoring should also account for incomplete detection when possible (Yoccoz et al. 2001). Accounting for incomplete detection can be achieved by a range of approaches using mark-recapture, depletion trials, and occupancy and mixture models (e.g., Jolly 1982; Buckland et al. 1993; Gould and Pollock 1997; MacKenzie and Kendall 2002; Royle 2004). While accounting for variable detection is not as common in studies in aquatic systems as for terrestrial systems, some recent examples are emerging (Coggins et al. 2006; Bradford et al. 2011; Gerig et al. 2014; Lyon et al. 2014).
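To make the idea of estimating detection directly more concrete, the following sketch applies the standard two-pass removal (depletion) estimator. It assumes a closed population between passes and an equal capture probability on both passes; the catch numbers are invented for illustration.

```python
def two_pass_removal(catch_1, catch_2):
    """Two-pass removal (depletion) estimate of abundance and capture probability.

    Assumes a closed population and equal capture probability on both passes.
    Returns (n_hat, p_hat). Raises ValueError when the second catch is not
    smaller than the first, because the estimator is undefined in that case.
    """
    if catch_1 <= catch_2:
        raise ValueError("estimator requires catch_1 > catch_2")
    p_hat = (catch_1 - catch_2) / catch_1
    n_hat = catch_1 ** 2 / (catch_1 - catch_2)
    return n_hat, p_hat

# Hypothetical catches: 60 fish on the first pass, 24 on the second.
n_hat, p_hat = two_pass_removal(60, 24)
print(round(n_hat), round(p_hat, 2))
```

In this invented case the first-pass catch alone would understate abundance, and if capture probability shifted with discharge or water clarity, the size of that understatement would shift with it.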

Although the methods outlined above can be effective at estimating the detection probability and accounting for variable sampling efficiency, they can also be costly. Recently, alternative approaches for accounting for imperfect detection have been developed that rely only on repeated sampling at spatially replicated sites (MacKenzie and Kendall 2002; Tyre et al. 2003). These methods tend to be less costly and can be used to estimate patterns in site occupancy (MacKenzie and Kendall 2002; Tyre et al. 2003) and abundance (Royle and Nichols 2003; Royle 2004) while accounting for incomplete detection (see, e.g., Wisniewski et al. 2013; Beesley et al. 2014b).
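A minimal sketch of how such site-replicated data can separate occupancy from detection is given below. It uses the basic single-season occupancy likelihood with a constant occupancy probability (`psi`) and a constant per-visit detection probability (`p`), fitted by a crude grid search. The detection histories are invented, and a real analysis would normally use dedicated software and allow covariates such as flow on both parameters.

```python
import math

def occupancy_log_likelihood(psi, p, histories):
    """Log-likelihood of a single-season occupancy model with constant
    occupancy probability `psi` and per-visit detection probability `p`.

    `histories` is a list of per-site detection histories (lists of 0/1).
    """
    total = 0.0
    for h in histories:
        k, d = len(h), sum(h)
        if d > 0:
            # Site is certainly occupied: d detections in k visits.
            lik = psi * p ** d * (1 - p) ** (k - d)
        else:
            # Either occupied but never detected, or truly unoccupied.
            lik = psi * (1 - p) ** k + (1 - psi)
        total += math.log(lik)
    return total

def fit_by_grid(histories, steps=99):
    """Crude maximum-likelihood fit by grid search over (psi, p)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]  # 0.01 .. 0.99
    return max(((psi, p) for psi in grid for p in grid),
               key=lambda t: occupancy_log_likelihood(t[0], t[1], histories))

# Invented data: 10 sites, 3 visits each.
histories = [[1, 0, 1], [0, 0, 0], [1, 1, 0], [0, 0, 0], [0, 1, 0],
             [0, 0, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
psi_hat, p_hat = fit_by_grid(histories)
naive = sum(1 for h in histories if any(h)) / len(histories)
print("naive occupancy:", naive, "psi_hat:", psi_hat, "p_hat:", p_hat)
```

The naive proportion of sites with at least one detection ignores sites that were occupied but missed on every visit, which is why the model-based estimate of occupancy is never lower than the naive one.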

Recommendation 7: Responses Should be Measured Within Timeframes that are Relevant to the Indicator(s)

Decisions about when to monitor should consider not only the monitoring program objective and the management reporting needs, but also the likely response time of the chosen indicators (Souchon et al. 2008; Beesley et al. 2012, 2014a). For example, water quality changes generally occur rapidly (Tate et al. 1999), whereas changes in fish populations or assemblages will occur over much longer timeframes (Beesley et al. 2014a, b). Species life-spans and life history preferences are important considerations for biota. Short-lived species with high recruitment outputs are likely to respond faster than longer-lived species and are therefore more suited to short-term monitoring (Souchon et al. 2008). For example, Robinson et al. (2003) suggested that the response of macroinvertebrate assemblages to a regular experimental flooding regime is likely to occur over years rather than months, as the species composition adjusts to the new and more variable habitat template. Identifying a monitoring time scale that incorporates the appropriate response time can be difficult if the indicator is poorly understood, and a pilot program could be beneficial in this case.

To increase the strength of causal inference, samples should be collected prior to the intervention and after the intervention using a Before-After monitoring design, at a frequency and duration that allows the indicator to respond, but not so long that other influences start to affect indicator response. To some extent, the timing, duration and frequency of the ‘after’ sampling event will depend on the parameter of interest. In an intervention designed to increase fish spawning and recruitment, sampling would need to be undertaken with sufficient frequency to detect spawning events. For example, to assess fish spawning and relative recruitment in the Murray River in relation to flow, King et al. (2009, 2010) sampled spawning at fortnightly intervals over a 6-month time period for several years, and sampled the number of recruits over 4 months after each spawning season. In contrast, if the monitoring objective was to detect changes in the diversity of the fish community after watering, sampling would not need to be so frequent. For example, Valentine-Rose and Layman (2011) chose to sample annually to obtain an annual census of fish densities when assessing the effectiveness of restoration measures on ecosystem structure and function in mangrove wetlands. However, this approach is likely to be more susceptible to the influence of confounding factors, and inferring that a response is due only to the watering intervention would be difficult.

Collecting suitable ‘before’ data is also not always possible and compromises in the experimental design must often be made. For example, sampling wetlands prior to watering interventions is not always practical as they may be dry, or because water may be delivered before appropriate ‘before’ data can be collected (Beesley et al. 2014a). To address this lack of ‘before’ data, Beesley et al. (2014a) considered the likely response times of the key fish species and employed a strategy of sampling population abundance at three defined intervals after the watering event:

  • Time 1: 1–2 weeks post flooding. This was chosen (1) to enable both dry and wet wetlands to be treated in an equivalent manner, and (2) as a short enough time period that fish would not have had time to respond in any other way except to move into the wetland,

  • Time 2: 6 weeks post flooding. This was chosen so that the difference between 1–2 weeks post flooding and 6 weeks post flooding could be used to describe the short-term fish response to watering, and,

  • Time 3: end-of-spawning season. This was chosen so that the difference between 1–2 weeks post flooding and end-of-spawning season could be used to describe the spawning season response.
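The two response measures in the list above are contrasts against the Time 1 baseline. The sketch below expresses them as raw differences in abundance. The function name, dictionary keys and abundance values are hypothetical, and whether the original analysis used raw differences, ratios or model terms is not stated here, so raw differences are an assumption.

```python
def watering_responses(time_1, time_2, time_3):
    """Return the short-term and spawning-season responses as differences
    from the Time 1 baseline (1-2 weeks post flooding)."""
    return {
        "short_term": time_2 - time_1,        # 6 weeks minus baseline
        "spawning_season": time_3 - time_1,   # end of season minus baseline
    }

# Invented abundance values for one wetland.
responses = watering_responses(time_1=12.0, time_2=20.0, time_3=35.0)
print(responses)
```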

Recommendation 8: Watering Events Should be Treated as Replicates of a Larger Experiment

If the objective is simply to demonstrate a response to an individual environmental flow event, then a before–after–control–impact design will support strong inference, but may be thwarted by the difficulty of obtaining suitable ‘before’ data and ‘control’ sites (Downes et al. 2002; Growns 2004; Chee et al. 2009). Gillespie et al. (2014) reviewed environmental flow studies world-wide and found that fewer than 20 % used control sites, with considerable variation in the type of control site used. Gillespie et al. (2014) proposed that the control site most likely to be effective is one that is independent and regulated, but suggested that further research is required to test this hypothesis. If, however, the monitoring objective sits within an AM framework, then outcomes of previous monitoring activities can be combined and may be used to improve our understanding of the system. Combining monitoring outcomes of different watering events treats multiple watering events as replicates of a larger experiment based around an underlying conceptual model in an AM framework (see Fig. 3). The capacity to assess interacting or confounding factors is one of the main strengths of combining the data from many individual watering events. It may also assist in teasing out the influence on ecological responses of antecedent conditions (flow events or ecological condition) prior to an environmental flow (see for example Beesley et al. 2014b).

Fig. 3 Examples of biotic response–flow characteristic relationships using multiple watering events (A–H). Note that while the biotic response does not change, the distribution across the gradient of each flow characteristic (x-variable) does change

The benefit of a multi-intervention approach is that it allows ecological response relationships to be developed. For example, we can determine if flow causes a threshold response (a threshold needs to be reached before improvement), linear response (continuing response improvement), or asymptotic response relationship (i.e., after a threshold, no further improvement is seen). In this way, a ‘library’ of responses linked to environmental parameters is assembled, and provides the opportunity to analyze responses along gradients of flow characteristics, ecosystem character or condition and to describe interactions. Using a multi-intervention approach would also aid in reducing statistical uncertainty and increase precision and confidence in the responses observed (Gillespie et al. 2014). Initially, the approach would require many events to be monitored, to build confident relationships. However, we do not suggest that all events be monitored, and over time, the number of events that are monitored could be reduced if there was sufficient information about that type of watering event. In addition, monitoring every event is unlikely to generate the best return on investment. Priority events should be those that collect data that inform a critical flow relationship (i.e., a priority gap in knowledge), and those that are likely to yield the best scientific information: for example, where confounding factors are at a minimum, sampling is feasible and accurate, and replication and statistical power are adequate.
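As a minimal sketch of how pooled events might be used to distinguish response shapes, the code below fits a linear model and a simple step-threshold model to eight hypothetical watering events and compares their residual sums of squares. The data, the variable names and the choice of a step function to represent a threshold are illustrative assumptions; a fuller analysis would also fit an asymptotic form and penalize models for their number of parameters (for example with an information criterion).

```python
def sse(observed, predicted):
    """Residual sum of squares."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

def fit_linear(x, y):
    """Ordinary least squares line; returns (sse, (intercept, slope))."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    pred = [intercept + slope * xi for xi in x]
    return sse(y, pred), (intercept, slope)

def fit_threshold(x, y):
    """Step model: one mean below a breakpoint, another at or above it.
    The breakpoint is chosen from the observed x values by least squares.
    Returns (sse, (breakpoint, mean_below, mean_above))."""
    best = None
    for bp in sorted(set(x))[1:]:
        low = [yi for xi, yi in zip(x, y) if xi < bp]
        high = [yi for xi, yi in zip(x, y) if xi >= bp]
        m_low, m_high = sum(low) / len(low), sum(high) / len(high)
        pred = [m_low if xi < bp else m_high for xi in x]
        cand = (sse(y, pred), (bp, m_low, m_high))
        if best is None or cand[0] < best[0]:
            best = cand
    return best

# Hypothetical events A-H: flow characteristic (e.g. inundation days)
# and a biotic response index. Values are invented.
flow = [5, 10, 15, 20, 25, 30, 35, 40]
response = [1.0, 1.2, 0.9, 1.1, 4.8, 5.1, 5.0, 4.9]

lin_sse, lin_params = fit_linear(flow, response)
thr_sse, thr_params = fit_threshold(flow, response)
print("linear SSE:", round(lin_sse, 2))
print("threshold SSE:", round(thr_sse, 2), "breakpoint:", thr_params[0])
```

No single event in this invented dataset could reveal the shape of the relationship; it only emerges when events spanning the flow gradient are analyzed together.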

Recommendation 9: Environmental Flow Outcomes Should be Reported Using a Standard Suite of Metadata

A major impediment to achieving a multi-intervention approach is the variation in the reporting of environmental flow descriptions, key parameters examined and their ecological responses to the intervention. A simple, yet fundamental advance in environmental flow science would be for all monitoring programs to report on a standard suite of data or metadata to describe the response of indicators to the environmental flow. The metadata would include descriptions of the overall objectives and targets of the program and watering event, the type and characteristics of the environmental flow to be delivered, the nature and condition of the target system and the response of variables measured (see for example Table 1). This dataset should also list all indicators monitored, irrespective of the response outcome. These could be incorporated into a central database maintained by an agreed authority, and over time, would provide the ability to interrogate the database to investigate the response of specific indicators to multiple interventions or environmental flows. The regulatory or management agencies creating and maintaining these databases could require a standardized protocol to record where and how environmental flows were implemented and the outcomes achieved. Collecting and managing metadata has the advantage of not relying on institutional or individual knowledge of past environmental flows and their outcomes.

Table 1 Examples of the metadata which should be reported for all environmental flow events
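One way such a standard record could be structured is sketched below. The class and field names are hypothetical and are derived only from the categories named in the text (objectives and targets, flow type and characteristics, system nature and condition, and indicator responses); they are not the contents of Table 1 or the schema of any existing database.

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class IndicatorResult:
    name: str        # indicator monitored, e.g. "larval fish abundance"
    response: str    # e.g. "increase", "decrease", "no change"
    notes: str = ""

@dataclass
class FlowEventRecord:
    program_objective: str       # overall objective of the program
    event_target: str            # target of this watering event
    flow_type: str               # type of environmental flow delivered
    flow_characteristics: dict   # e.g. magnitude, duration, timing
    system_type: str             # nature of the target system
    system_condition: str        # condition prior to watering
    indicators: list = field(default_factory=list)

# Invented example record. Indicators with no detected response are
# listed too, so that null results are not lost from the database.
record = FlowEventRecord(
    program_objective="restore native fish recruitment",
    event_target="inundate wetland for the spawning season",
    flow_type="overbank",
    flow_characteristics={"duration_days": 30, "season": "spring"},
    system_type="floodplain wetland",
    system_condition="dry for two years before watering",
    indicators=[
        IndicatorResult("larval fish abundance", "increase"),
        IndicatorResult("macrophyte cover", "no change"),
    ],
)

# Serialize to JSON for storage in a shared database.
print(json.dumps(asdict(record), indent=2))
```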

Conclusion

Around the world, rivers are degraded, but with limited water and financial resources available for their restoration, there is an increasing expectation of accountability and transparency in the implementation of environmental flows. As a consequence, it is critical that we rigorously appraise the efficacy of environmental flow interventions. However, an effective appraisal relies on well-designed monitoring programs. In recognition of the need for continued improvement in how we monitor so that we can better learn, this paper provides a series of recommendations based on the recent literature and our combined experiences of environmental flow monitoring.

The recommendations highlight the need to embed monitoring programs within a clearly defined environmental flow program that has a strong conceptual underpinning. They also highlight the inherent need for monitoring programs to be targeted at improving our understanding of flow ecology relationships and testing underlying conceptual models, thus contributing to improved effectiveness of future environmental flows. Building on previous monitoring frameworks (Souchon et al. 2008), the recommendations acknowledge the importance of the environmental and managerial hierarchies in which intervention monitoring is embedded and also propose specific design principles that will help identify ecosystem responses to environmental flows within the variation inherent in aquatic ecosystems. To address these issues, we also propose that environmental flow events be assessed as a collective where possible, by treating individual flow events as replicates in a larger experiment. Using a multi-intervention approach is advantageous because it allows us to increase our sample size, and hence our capacity to detect flow-biota relationships, assess interacting factors, test models (conceptual and quantitative), and tease out the influence of antecedent conditions.

A multi-intervention approach, though, is not without cost, as it increases the need for the environmental flow program to have standard methods, appropriate quality control, and data management. In instances where multiple institutions are involved, there will be an increased need for close collaboration among the management institutions and monitoring service providers, including a process for developing standard methods, data sharing, and coordination protocols. Improving institutional arrangements will lead to more cost-effective monitoring programs, a rapid improvement in our understanding of ecological responses to watering, and ultimately to improved outcomes from environmental flow restoration.