Introduction

Prior to the 1990s, river health assessment mainly relied on water quality measures; however, in more recent times, assessment programs have focused on the direct measurement of characteristics of the biota (mainly benthic macroinvertebrates, algae, vegetation, and fish) or ecosystem processes (e.g., Karr 1981; Fausch et al. 1984; Metcalfe-Smith 1996; Bunn et al. 2010; Angradi et al. 2011). Holistic multimetric methods combine indicators that represent the biological, chemical, and physical aspects of ecosystems (e.g., Ladson et al. 1999; Parsons et al. 2002; Zhao et al. 2005; Bunn et al. 2010; Davies et al. 2010). This approach expresses river health as a string of sub-indicator values or as a single integrated index without attempting to separate cause (abiotic drivers, or pressures) from effect (biotic response) variables. Including driver/pressure variables enhances the capacity of the assessment program to diagnose the cause of river health problems, identify issues or locations that require management intervention, and evaluate the relative success of management actions that have been implemented.

While assessment of river health is now a relatively routine practice in many parts of the world, most of the focus has been on wadeable streams rather than large lowland rivers (i.e., low-gradient, high-order rivers that are not wadeable) (Johnson et al. 1995). The most important issue in the assessment of the health of large rivers is the determination of the reference state. In the context of river health assessment, the term “reference” refers to the benchmark against which the current condition of the river is assessed. Stoddard et al. (2006) defined five different interpretations of the reference condition used in river health assessment: (1) the reference condition for biological integrity, RC(BI), or unimpaired by human disturbance; (2) the minimally disturbed condition, MDC, where a low level of human disturbance has had little impact on river condition; (3) the historical condition, HC, or the condition of the river at a specified time in history; (4) the least-disturbed condition, LDC, or the best condition that currently exists in the region; and (5) the best attainable condition, BAC, or expected condition for the given human use of the river if best management practices are followed.

Some aspects of large lowland rivers present a challenge when attempting to establish a reference in river health assessment. Most large lowland rivers of the world have been impacted by human disturbance for a considerable length of time (Weigel and Robertson 2007) and therefore lack the data needed to establish RC(BI) or MDC. Large lowland rivers tend to be regionally unique, so even if undisturbed lowland rivers were to be found, they would not necessarily be a valid reference for other rivers. While it is possible to derive RC(BI) for rivers lacking data through modeling or expert opinion (e.g., Wessell et al. 2008; Angradi et al. 2009; Davies et al. 2010), this approach would not be without risks. A large lowland river is typically developed along its entire length, so LDC would generally represent a considerably impaired state and therefore lead to overly optimistic assessments of river health (Wessell et al. 2008). Thus, in large lowland river settings, HC or BAC could be a pragmatic choice of reference.

The Yellow River, which is among the world’s largest rivers, is vulnerable to environmental degradation and water scarcity (Varis et al. 2012). There is a general recognition that the ecological health of the lower Yellow River reached its nadir in the late 1990s, when the river suffered long periods of drying (Liu et al. 2006). Although frameworks for assessing the health of the Yellow River have previously been suggested (Zhao et al. 2005; Liu et al. 2006), and a composite index of health has been proposed (Ni and Liu 2006), the biological health of the river has not previously been scored. The relative biological health of the lower Yellow River is of interest because the managing authority, the Yellow River Conservancy Commission (YRCC), has made significant efforts since 1999 to improve river health by improving water quality, scouring the aggraded channel bed, and introducing environmental flows (Gippel et al. 2012). These measures were facilitated by the construction of the Xiaolangdi Dam, which has allowed greater control over sediment and flow in the lower Yellow River.

This paper describes an approach to biological health assessment in a heavily utilized, large lowland river setting, using the example of the lower Yellow River, China. The main objectives of this work are to explain the approach taken to define biological reference conditions, to describe the metrics and weightings used to score river health, and to report on the health of the lower Yellow River and its implications for management.

Materials and methods

Study area

The Yellow River is 5464 km long with a basin area of 752,443 km². The watershed area is as large as 794,712 km² if the inner flow area in Erdos is included (Fu et al. 2004). The Yellow River basin is traditionally divided into the upper, middle, and lower reaches (Yu 2006; Ma et al. 2010) (Fig. 1). The basin is mostly arid and semi-arid land. In the middle basin, the river cuts through a loess mantle that suffers severe soil erosion (Wu et al. 2004). The enormous amounts of sediment delivered from this area have created a suspended floodplain within the diked section that is 3–10 m higher than the land outside the dikes for most of the river’s course (Yu 2006; Wu et al. 2008). The sediment that has reached the sea has created a vast, morphologically and ecologically dynamic delta landform about 7500 km² in area. Although the sediment load of the Yellow River has declined in recent times due to soil conservation works, sediment trapping by dams, and reduced discharge, it remains high by world standards (Wang et al. 2007). Before Sanmenxia Dam began operating in April 1957, the lower Yellow River was largely unregulated. Xiaolangdi Dam, completed in December 1999, is the most recent large dam. It is located at the upstream boundary of the lower Yellow River (Fig. 1). Flood dikes constructed along the length of the lower river have severed the natural hydrological connection between the river and its floodplain. The key location-based ecological assets of the lower Yellow River are the river channel itself, three riverine wetland reserves (Mengjin, Kaifeng, and Zhengzhou), a large lake (Dongpinghu), and the Yellow River Delta National Nature Reserve, which was Ramsar listed in 1994 and listed as a National Demonstration Reserve in 2006 (Fig. 1).

Fig. 1 The lower Yellow River study area, showing the four reaches identified for river health assessment, and the key ecological assets

River reach division

Consideration of the geomorphological and hydrological characteristics and the locations of the key ecological assets of the lower Yellow River (Fig. 1) led us to assess four reaches separately. These reaches (numbered 1–4 from upstream to downstream) were based on the widely used geomorphological division attributed by Wu et al. (2005) to Qian et al. (1993), although here the boundary of the estuarine reach was shifted from Lijin, which is well beyond the reach of tidal effects or saline water, to a point 20 km upstream of the mouth (Fig. 1). Gippel et al. (2012) provided a detailed review of the literature describing the biological, physical, and chemical characteristics of the four reaches, including the ecological assets within the reaches. The lower Yellow River is highly modified and controlled, and from the perspective of broad physical, chemical, and biological characteristics, the reaches can be considered relatively homogeneous.

Biological indicator groups, sampling, and identification

Bioassessment of the lower Yellow River was based on assessments of fish, benthic macroinvertebrates, and riparian vegetation indicator groups, with each group comprising a number of indices. These three biological groups make important contributions to the structure and function of river ecosystems, and are among the most commonly used biological measures of river health (Rosenberg and Resh 1993; Naiman and Décamps 1997; Barbour et al. 1999; Verdonschot et al. 2000; Richardson et al. 2007).

Two biological surveys were carried out in the spring (April–June) and the autumn (September–October) of 2008. Three sites were sampled during the spring survey at Huayuankou (reach 1), Gaocun (reach 2), and Lijin (reach 4). Two additional sites, Aishan (reach 3) and the estuary (reach 4), were included in the autumn survey (Fig. 1). The estuary reach flows through the vast Yellow River delta, but our survey was restricted to the vicinity of the main channel. The sites were selected on the basis of access and reasonable proximity to a hydrological gauging station, and were intended as fixed locations at which re-surveying will be performed in the future. The sampled areas of each reach had a limited range of distinctive habitat types present. Reach 1 had two types: the main channel with a small area of low-lying riparian wetland, and a large area of wetland at Huayuankou. Reaches 2 and 3 had one type: the main channel with a small area of low-lying riparian wetland. Reach 4 had four types: the main channel, anabranches, large areas of wetlands, and permanent waterbodies. These habitat types are consistent with those used by Xu et al. (2004) and Wang et al. (2010). The gauging stations used to characterize the reach hydrology were Huayuankou (reach 1), Sunkou (reach 2), Luokou (reach 3), and Lijin (reach 4) (Fig. 1). The same locations were used to characterize reach water quality except for reach 2, which used data from Gaocun.

At each site, local fishers were employed to sample fish using their own gear, and were under instructions to collect as many species as possible. This is a common approach to fish sampling in China when the objective is to maximize the diversity of the catch. Sampling attempts by the field team always produced a catch of lower abundance and diversity than that of the local fishers. Fish samples were collected from a single sampling effort at each site on each sampling occasion using gill (2 or 3 cm mesh) and casting (4 cm mesh) nets, as described by Ru et al. (2010). The gill nets (100 m in length) were set in the shallow waters from dusk to dawn for 2–3 whole nights, and the casting nets were used along the river bank for 1–2 h in the afternoon. We used two different gears to increase the range of the population being sampled. All collected specimens were identified to species level according to taxonomic monographs (Chen et al. 1998; Chu et al. 1999; Yue et al. 2000; Zhu 1995). We also consulted the local fishers regarding species they had caught in the last few months. If a species had been found in the last few months but was not found in our surveys, it was added to our species list.

Benthic macroinvertebrates were collected using methods that are standard for large river channels in China, as described in Zhao (2010). Animals were collected from between three and six locations per site using a modified Peterson grab sampler (1/16 m²), and cleaned gently with a 425-μm sieve. Although this method is commonly used in deep water, we also used it for shallow shoreline habitats. The number of samples taken was proportional to the number of macroinvertebrate habitat types present, so more samples were taken in reach 1 than the other reaches. Shoreline habitats were also sampled by sweeping a D-frame dip net (425-μm mesh), and these samples were used to supplement the assessment of diversity. Animals were sorted in the laboratory and cleaned with distilled water prior to preservation in 10% formalin. All specimens were counted and identified to species level, or to genus level if the species was indeterminate according to taxonomic monographs (Brinkhurst and Jamieson 1971; Epler 2001; Liu et al. 1979; Morse et al. 1994; Wang 2002), from which density (individuals/m²) and species richness data were derived. After drying the animals on absorbent paper, their wet weight was determined as an estimate of biomass. Biomass information was not used in this health assessment due to a lack of reference data, but it is reported here for comparison with future surveys. Samples of animals were also collected in the field by hand net as a qualitative supplement to the formally collected samples. Macroinvertebrates were also classified into functional feeding groups (FFGs) according to Morse et al. (1994), Liu (2006), and Zhao (2010).

Riparian plant data were collected by counting the total number of species observed over a single transect, beginning at the water’s edge and bearing perpendicular to the channel until essentially terrestrial vegetation was encountered. Sampling transects were up to 500 m long. Also, at each site, plant samples were collected from between 3 and 13 plots of 1 m², located along the transect. The plots were selected to characterize riparian zones of distinctly different elevations, so the number of plots reflected the morphological complexity. The aerial portions of all plants in the plots were collected, identified to species level using standard keys (Fu 2002; Li and Xu 2005; Ma and Zhao 2007; Diao 1983; Zhao 2002), and their wet weight determined. Density and biomass information was not used in this health assessment due to a lack of reference data, but the information for grass species is reported here for comparison with future surveys.

Determination of the reference

Reference conditions for fish, macroinvertebrates, and plants were constructed on the basis of historical data, published and unpublished literature, local knowledge, and expert knowledge. Two major historical surveys of fish were conducted in the lower Yellow River in 1958 and 1980, from which only presence/absence information could be extracted. Information was gathered regarding the species caught locally by anglers, and the local species offered for sale in markets. In addition, local residents and fishery management organizations were consulted regarding local fishery information. A complete list of fish species that could potentially be present in the lower Yellow River was constructed using all available information, and then a list of expected species was created by excluding marine species and those known to be extinct before the 1950s. This list was then reviewed by two experts (Xiu-Qi Li, Freshwater Fisheries Research Institute of Shandong Province, Jinan and Hui-Jun Ru, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan). The final species list was assumed to indicate the maximum potential taxonomic diversity of fish from the 1950s.

Regarding benthic macroinvertebrates, the published studies were all from reach 4 (the estuary), and only a limited number of surveys were available (Zhang et al. 1990; Zheng et al. 2010; Jia and Tian 2003). Only presence/absence information could be extracted from available sources. The reference condition for macroinvertebrates was reconstructed assuming that species richness was proportional to the number of different habitat types available. Historical data from the delta were used to estimate the species richness of the other reaches by comparing the availability of habitats in those reaches. For example, there were four types of habitats and 69 expected freshwater species in reach 4 (estuary) but only one habitat type in reach 2, so the expected species richness in reach 2 was estimated as 1/4 × 69 ≈ 17. The coarseness of this reference estimation method is acknowledged, but was unavoidable given the lack of an alternative. This method was also used to estimate the expected density of macroinvertebrates based on the densities of macroinvertebrates found in similar habitats in other rivers and waterbodies in China (Zhao 2010). The macroinvertebrate reference data were assumed to represent the maximum potential taxonomic diversity that prevailed in the late 1980s to early 1990s. Our expected total species richness was 69 for the estuarine reach, with the other reaches having smaller totals. This is not inconsistent with the 74 taxa found by Zhao (2010) in a survey of 21 Yellow River sites covering the entire basin, and the 71 taxa found by the Cooperation Team of Survey on Fishery Resources of the Yellow River System (1986), in a survey between Lanzhou (upper Yellow River) and the delta.
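The habitat-proportional estimate described above can be sketched in Python as follows. This is a minimal illustration rather than the procedure actually used: the function name is hypothetical, and rounding to the nearest integer is an assumption (the text only shows 1/4 × 69 being reported as 17).

```python
# Reference values for reach 4 (estuary), taken from the text.
REFERENCE_HABITAT_TYPES = 4
REFERENCE_RICHNESS = 69


def expected_richness(habitat_types: int) -> int:
    """Scale the reach 4 reference richness by the share of habitat types present.

    Rounding to the nearest integer is an assumption made for this sketch.
    """
    if not 0 < habitat_types <= REFERENCE_HABITAT_TYPES:
        raise ValueError("habitat_types must be between 1 and 4")
    return round(REFERENCE_RICHNESS * habitat_types / REFERENCE_HABITAT_TYPES)


# Reaches 2 and 3 each had one habitat type.
print(expected_richness(1))  # 17
```

Because the estimate is linear in the number of habitat types, any error in the reach 4 reference list carries over proportionally to the other reaches.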

Many plant surveys have been undertaken on the lower Yellow River, especially in the delta region. Of the published studies, we relied mainly on the detailed surveys of Duan et al. (2008) and Song (2005) for the upper part of the lower Yellow River, and Zhang et al. (2006) for the estuary. Recent studies on plant species richness by Zhang et al. (2010a, b) did not contradict the earlier surveys. The historical data provided only limited information on species richness. The riparian plant reference data were assumed to represent the maximum potential taxonomic diversity that prevailed in the early 2000s.

Initial data processing

The data from the two 2008 surveys were combined to form a single set of data for each of the fish, macroinvertebrate, and plant indicator groups. A number of potential ecological indices were derived from each of these datasets. These indices were selected because they are commonly used in China. Using the fish presence/absence data, total species richness and the numbers of natives, exotics, migratory species, and species within different feeding and habitat guilds were determined for each site (Table 1). Abundance and body size data were unavailable for most fishes. From the macroinvertebrate data, the density, biomass, and the numbers of species within different taxonomic groups and FFGs were determined for each site (Table 2). If a species belonged to more than one FFG, its contribution to the indicator value (richness, density, or biomass) was divided equally among those FFGs. Also, from the macroinvertebrate data, the Shannon index H’ of species diversity (Spellerberg and Fedor 2003) (calculated using the formula \(H' = -\sum\nolimits_{i = 1}^{n} p_{i} \log_{2} p_{i}\), where \(p_{i}\) is the proportion of individuals that belong to the ith taxon in the dataset and n is the number of taxa present) was determined for each site (Table 2). From the plant data, total species richness and the number of grass, woody, and salt-tolerant species were determined for each site, along with the density and biomass of grass species (Table 3).
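Two of the calculations in this step, the Shannon index (taken here with the conventional leading minus sign and base-2 logarithm) and the equal division of a taxon's value among its FFGs, can be sketched in Python. The function names, taxon labels, FFG labels, and all numbers below are hypothetical and serve only as illustration; they are not survey data.

```python
import math
from collections import defaultdict


def shannon_index(counts):
    """Shannon diversity H' (log base 2) from per-taxon individual counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one individual is required")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in proportions)


def split_by_ffg(values):
    """Share each taxon's value equally among the FFGs it belongs to.

    `values` maps a taxon name to (value, list of FFG names).
    """
    totals = defaultdict(float)
    for value, groups in values.values():
        for group in groups:
            totals[group] += value / len(groups)
    return dict(totals)


# Hypothetical counts: four equally abundant taxa.
print(shannon_index([10, 10, 10, 10]))  # 2.0

# Hypothetical densities (individuals per square metre) and FFG labels.
example = {
    "taxon_a": (8.0, ["collector", "scraper"]),
    "taxon_b": (6.0, ["collector"]),
}
print(split_by_ffg(example))  # {'collector': 10.0, 'scraper': 4.0}
```

With four equally abundant taxa the index equals log2(4) = 2, which is the theoretical maximum for that number of taxa.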

Table 1 Observed and expected reference indicator values for fish in the four reaches of the lower Yellow River
Table 2 Observed and expected reference indicator values for macroinvertebrates in the four reaches of the lower Yellow River
Table 3 Observed and expected reference indicator values for riparian plants in the four reaches of the lower Yellow River

Selection of the most suitable ecological indices

Where possible, the expected reference values of the calculated ecological indices were determined. A subset of indices for which a historical reference was available was selected for inclusion in the river health assessment based on the indices’ simplicity of interpretation, relevance to the main issues of the lower Yellow River, information content, and known sensitivity to human disturbance (Table 4).

Table 4 Ecological indicators selected for inclusion in the lower Yellow River health assessment

H’ was not used in this health assessment because of the lack of reference data from the lower Yellow River. It would have been unreasonable to use the theoretical highest value of H’ (even distribution of abundance among the species present, see Table 2) as a benchmark because it is not known if such values have occurred naturally in the lower Yellow River. Nevertheless, calculated values of H’ are reported here for comparison with future surveys (Table 2).

Total species richness is the most important index for all ecological groups because it represents information on diversity (specific to the sampling methods used in this study). Five indices, including total species richness, were selected for fish (Table 4). With regard to trophic structure, top predators are widely regarded as keystone species in freshwater ecosystems (e.g., Carpenter et al. 2008; Woodward et al. 2008), so piscivore species richness was selected as an index. Migratory species richness was also selected because this can be used to assess the impact of barriers, whether physical, hydraulic, or chemical (e.g., Rolls 2011). Considering that both brackish and freshwater species occurred in reach 4 (estuary), the freshwater species richness was used as an index in this reach. Native species richness was also selected to show the effects of exotic species (cf. Kennard et al. 2005). Three indices were determined for macroinvertebrates: total species richness, number of functional feeding groups (FFG), and total density (cf. Blocksom and Johnson 2009) (Table 4). These represented the total biodiversity, functional biodiversity, and productivity, respectively. Only one index, total species richness, was selected for riparian plants according to available data.

Computing, weighting, and combining indicator scores

River health was assessed for each reach by calculating O/E (observed/expected) metric scores for each indicator. For example, 31 fish species were observed in reach 1, and the expected number of species in reach 1 was estimated to be 83, so the O/E metric score for species richness was calculated as 31/83 = 0.37. This method gave a metric score for each indicator in the range 0–1, with higher values indicating ecological conditions closer to the reference. The indicator group scores and the combined index score for each reach were calculated according to a procedure that weighted the values of the indices and indicator groups. The weighting procedure (described below) was developed for this study using expert opinions.
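The O/E calculation can be expressed as a short Python sketch. The function name is hypothetical, and capping the score at 1 is an assumption made here to keep values within the stated 0–1 range; the text does not say how a case with more observed than expected species would be handled.

```python
def oe_score(observed: float, expected: float) -> float:
    """Observed/expected metric score.

    Capping at 1.0 is an assumption of this sketch, used to keep the
    score within the 0-1 range described in the text.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    return min(observed / expected, 1.0)


# Fish species richness in reach 1: 31 observed, 83 expected.
print(round(oe_score(31, 83), 2))  # 0.37
```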

Indicator group scores for each reach were first determined for fish, macroinvertebrates, and plants based on the selected individual indices (using a weighted averaging procedure). For each reach, the indicator group scores for fish (F 1–4), macroinvertebrates (M 1–4), and riparian plants (P 1–4) were weighted by perceived importance of the individual indices (Table 5). With regard to fish, total species richness was considered the most important measure of overall health and was given the highest weight. Richness of native species was given the lowest weight because, with only one exotic species expected and recorded in the lower Yellow River, this index gave values close to that of the total species richness. Indices of richness of migratory and piscivore species were given mid-level weights, except in reach 4 (estuary), where migration was less important. With regard to macroinvertebrates, total species richness, number of FFGs, and total density were given weights of 0.4, 0.1, and 0.5, respectively. Both total species richness and number of FFGs represent information on biodiversity, and total density provides information on productivity. In this way, the two types of indices (i.e., biodiversity and productivity) were equally weighted.
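As an illustration of the weighted averaging, the following Python sketch applies the macroinvertebrate weights given above (0.4, 0.1, and 0.5). The function and key names are hypothetical, and the metric scores are invented for illustration; they are not values from Table 6.

```python
import math

# Macroinvertebrate index weights stated in the text.
MACRO_WEIGHTS = {"total_richness": 0.4, "ffg_count": 0.1, "total_density": 0.5}


def group_score(metric_scores, weights):
    """Weighted average of O/E metric scores for one indicator group."""
    if set(metric_scores) != set(weights):
        raise ValueError("metric_scores and weights must have the same keys")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * metric_scores[name] for name in weights)


# Hypothetical O/E scores, for illustration only (not values from Table 6).
scores = {"total_richness": 0.5, "ffg_count": 1.0, "total_density": 0.2}
print(round(group_score(scores, MACRO_WEIGHTS), 2))  # 0.4
```

The two biodiversity indices together carry a weight of 0.5, matching the 0.5 given to density, which is how biodiversity and productivity end up equally weighted.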

Table 5 Equations used to weight the indicator metric scores and calculate the indicator group scores and combined ecological health index scores for each reach of the lower Yellow River

The scores for each river reach were calculated by summing weighted metric scores for the three indicator groups. For each indicator group, the score was weighted by data quality in terms of relative sampling effort (Table 5). Many studies have shown that abundance and diversity continue to increase up to a given sampled area (e.g., Colwell and Coddington 1994; Connor et al. 2000; Gotelli and Colwell 2001; van Gemerden et al. 2005; Hughes et al. 2011). In a comprehensive river health survey, a first step would be to establish the relationship between the value of the chosen index score and the level of sampling effort in order to determine the level of sampling effort at which the index value stabilizes. At this point, the survey data are a representation of the true community diversity rather than just a consequence of the level of sampling effort. Our 2008 biological survey of the lower Yellow River used conventional river health sampling methods; while the efficiency of the level of sampling effort was not evaluated, we believe that, for a river of this size, the effort was unlikely to sample all fish species that were present but was likely to sample fairly high proportions of the macroinvertebrate and plant species. Blocksom and Flotemersch (2005) demonstrated that different gear sample different portions of the benthic macroinvertebrate community, which is why we supplemented the grab sampling with dip net sampling of the shoreline habitats. No information was available on which to base an adjustment of the raw fish data, so an adjustment was made for the relative efficiency of the sampling effort by weighting the fish score lower than those for plants and macroinvertebrates when calculating the combined ecological health index score (Table 5). In reaches 1, 2, and 3, macroinvertebrates and plants were given the same weightings (0.4) (Table 5) because they were both taken with what were considered adequate sampling efforts. Fish were given a lower weighting (0.2) because of the relatively low sampling effort. In reach 4 (estuary), plants were given a higher weighting (0.6), and fish and macroinvertebrates were given lower values (0.2) due to similarly poor sampling efforts for these two groups.
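The combination step can be sketched in Python using the group weightings stated above. The function and key names are hypothetical, and the group scores are invented for illustration; they are not values from Table 6.

```python
# Indicator group weightings stated in the text, keyed by reach number.
UPPER_REACH_WEIGHTS = {"fish": 0.2, "macroinvertebrates": 0.4, "plants": 0.4}
GROUP_WEIGHTS = {
    1: UPPER_REACH_WEIGHTS,
    2: UPPER_REACH_WEIGHTS,
    3: UPPER_REACH_WEIGHTS,
    4: {"fish": 0.2, "macroinvertebrates": 0.2, "plants": 0.6},
}


def combined_index(reach, group_scores):
    """Sum of the weighted indicator group scores for one reach."""
    weights = GROUP_WEIGHTS[reach]
    return sum(weights[group] * group_scores[group] for group in weights)


# Hypothetical group scores, for illustration only (not values from Table 6).
scores = {"fish": 0.3, "macroinvertebrates": 0.5, "plants": 0.7}
print(round(combined_index(1, scores), 2))  # 0.54
print(round(combined_index(4, scores), 2))  # 0.58
```

The same group scores give different combined values in reaches 1 and 4, which shows how strongly the reach 4 result depends on the plant score.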

Results

The combined river health index scores for the river reaches (Table 6) indicated a degraded river health condition relative to the reference used here. The ecological health of reach 1 (combined index score = 0.56) was the closest to the reference, followed by reach 2 (0.41), reach 3 (0.37), and reach 4 (0.18).

Table 6 Ecological indicator metric scores for each reach in the lower Yellow River

Reach 4 scored relatively low for each indicator, which is only partly explained by the higher reference values for the estuary. Historically, the estuary/delta (reach 4) had higher biological indicator values than the other reaches, but the 2008 survey found that for many indicators it had the lowest values of the reaches. For example, the values of total species richness for fish, macroinvertebrates, and riparian plants were all lower in reach 4 than in reach 1, yet reach 1 had lower reference values than reach 4.

Reaches 2 and 3 both scored low for the fish indicator group (0.09 and 0.08, respectively), driven mainly by low scores for the migratory species richness index (none present in reach 2, and one present in reach 3). Our sampling of reach 4 represented only a relatively small part of the delta, and more extensive sampling of the wider delta area might reveal higher diversity of biota than recorded in our survey.

Discussion

Current river health condition

This study found that in 2008, with respect to riparian plants, benthic macroinvertebrates, and fish, the lower Yellow River was distant from historical reference conditions, with the overall condition declining in the downstream direction. The main stressors on the lower Yellow River over the past 50–60 years have been artificial stabilization of the banks (mainly in reaches 2, 3, and 4), altered hydrology, sedimentation of the bed, and conversion of wetlands to farmland. By the 1990s, poor water quality was a significant stressor (e.g., Xia et al. 2002, 2005; Fan and Huang 2008; Xie et al. 2009; Yu et al. 2010a, b). Dikes alienate the river channel from its wider floodplain, but this situation has prevailed (to some extent) for centuries.

Hydrological and water quality data suggest that conditions in the early 1980s would have been conducive to good ecological health, but by the late 1990s conditions were unfavorable (Gippel et al. 2012). Since 2000, the intentional operation of Xiaolangdi Dam to achieve environmental benefits has led to noticeable improvements in water quality and hydrological conditions. One exception is a tendency for the drying of freshwater wetlands in reach 1 due to the lowering of the water table in association with channel incision (Zhao et al. 2011). Since the 1980s, wetlands in reaches 1 and 4 have declined in area due to their conversion to farmlands, and the reduced rate of accretion of the delta has resulted in an altered distribution of the main vegetation communities (Gippel et al. 2012). The relatively poor 2008 ecological health scores are therefore consistent with the history of impairment of the lower Yellow River. However, the positive trajectories of hydrology and water quality noted by Gippel et al. (2012) suggest that future improvement in the ecological health scores could be possible. The prospects for improvements in fish abundance and diversity, particularly for iconic migratory species, nevertheless remain uncertain.

Our findings suggest that migratory fish species in the lower Yellow River have been impacted substantially by the anthropogenic modifications to the river system that have occurred since the 1950s and are now rare. Index scores for migratory species for reaches 2 (0.00) and 3 (0.06) were the lowest among all scores determined in this study, and were noticeably low in comparison with those for reaches 1 and 4. Many of the migratory species in the Yellow River have high ecological, social, and/or economic importance. An example is Coilia nasus, one of the three species assigned by Liu et al. (2006) as key species, and once an important commercial species in the lower Yellow River (Cai et al. 1980). Poor water quality and hydrological alteration, particularly the frequent occurrence of long-duration cease-to-flow events that were a feature of the 1990s, restricted the upstream spawning migrations of C. nasus and isolated mature adults from their preferred spawning grounds. Dongpinghu, located on the boundary between reaches 2 and 3 (Fig. 1), was previously the most important spawning ground for C. nasus (Cai et al. 1980), but a regulator now disconnects it from the river channel most of the time. Alteration to salinity dynamics in the estuary associated with low flows and sedimentation of the bed may also have inhibited the upstream spawning migration of adult C. nasus. While a relatively large number of individuals of C. nasus were found in the vicinity of the estuary prior to the year 2000 (e.g., Cai et al. 1980), they are now so rare that local fishers report catching only a few individuals per year (Xiu-Qi Li, Freshwater Fisheries Research Institute of Shandong Province, pers. comm. 2010), and our 2008 survey failed to record this species. Another key migratory species, Coreius septentrionalis, essentially disappeared from the Yellow River by the 1980s. Although Sun (2010) reported that improvements in flow management had seen it return in recent times, the 2008 survey of Ru et al. (2010) failed to record this species anywhere in the basin. The third key species assigned by Liu et al. (2006), Cyprinus carpio, a nonmigratory species, remains an iconic species in the upper section of the lower Yellow River. Even though its abundance is much reduced from historical levels, it was recorded in our 2008 survey.

Determination of reference conditions

It was not possible to establish unimpaired reference conditions, RC(BI), or the minimally disturbed condition (MDC) in the lower Yellow River because (1) the entire length of the lower Yellow River is highly modified from its natural state, (2) it is in many respects a unique section of river, and there are no comparable but less disturbed rivers in China or elsewhere on which to base a reference, and (3) it would be inappropriate to extrapolate basin-wide relationships between ecological variables and an environmental disturbance gradient to the lower Yellow River, because where the Yellow River emerges onto the North China Plain, its scale and its geomorphic and hydraulic character change dramatically; it is likely there would have been a natural step change in the river’s ecological character at that point, independent of human disturbances. In addition, the reality is that ecological health must, at least partly, be traded off with socioeconomic benefits (cf. Pedroli et al. 2002), so the vision for the ecological health of the lower Yellow River is not close to RC(BI) (Liu and Liu 2009). The high level of disturbance throughout the entire lower Yellow River means that the use of the least-disturbed condition (LDC) as a reference might give a false impression of good health. This situation is likely to be common to many of the world’s heavily utilized large lowland rivers.

In this study, we used historical data and expert knowledge to define a reference that would represent maximum potential taxonomic diversity under the historical conditions (HC) of the 1950s for fish, the late 1980s to early 1990s for macroinvertebrates, and the early 2000s for plants. Setting the references for these three indicator groups at three different points in history was unintentional, and simply reflects data availability. As the level of human disturbance impacting the river generally increased between 1950 and 2000, it can be assumed that the reference for fish is closer to RC(BI) than are the references for macroinvertebrates and plants. On this basis alone, macroinvertebrate and plant indicators would likely achieve higher scores than fish indicators (incidentally, this was the case; see Table 6).

Considerations for sampling methods and indicator selection

Assessment of the biological health of the lower Yellow River channel and riparian areas was based on conventional field sampling, and our choice of indicators was limited by the type of reference data that were available. Reference species richness was based on detailed surveys and the wide experience of experts, so it represents a level that is not likely to be observed in conventional river health surveys. The importance of sampling effort in large rivers has been investigated and reviewed in many studies (e.g., Bady et al. 2005; Flotemersch et al. 2006; Blocksom et al. 2009; Hughes et al. 2011). Sampling effort is a significant issue, particularly for fish, because practicalities mean that only a small proportion of the total river area is sampled and only a limited time is available for sampling. In the present study, fish scores were probably underestimated due to the limited sampling effort, although this may have had little effect on the final scores of the river reaches because of the weighting process. The current absence of two of the three key fish species of the lower Yellow River is consistent with the overall poor river health scores and the history of impairment of the river. For further river health assessment, it is necessary to conduct site-specific field trials to establish appropriate sampling protocols. Also, rare taxa should be included in bioassessments of large rivers, as this has been considered important in assessments of wadeable streams (Blocksom and Johnson 2009; Weigel and Dimick 2011).

Conclusions and implications for management

In the present study, we assessed the ecosystem health of the lower Yellow River using multiple biological indicators. Historical conditions were adopted as the reference conditions, based on published literature, expert knowledge, and local knowledge. We presented an approach to the bioassessment of this heavily utilized large river, for which the river ecology is poorly documented. Such an approach is also applicable to other large Chinese rivers, where river health assessments have been poorly carried out and are usually based on a single biological indicator (Huang et al. 2010; Zhou et al. 2013). Our results have important implications for further river health assessment as well as river management.

As shown by our results, the ecological health of the lower Yellow River is poor when compared with the historical reference condition. This suggests that more water should be allocated in order to maintain a healthy river ecosystem, although the health of the river has been improved considerably since the 1990s (Liu et al. 2006). Liu and Liu (2009) estimated that in 2030, the Yellow River would require 25–27 × 10⁹ m³ of water per year to maintain the expected standard of ecological health, and 40 × 10⁹ m³ of water per year to achieve the expected economic and social development. Given that the average natural runoff was predicted to be 52 × 10⁹ m³ per year, the shortfall requires a compromise between ecological and social/economic objectives, or “…one step back for both…[rather than]…optimizing them” (Liu and Liu 2009).
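The size of the shortfall implied by these figures can be illustrated with a short calculation. The sketch below is only an arithmetic illustration using the values quoted from Liu and Liu (2009). It assumes that the ecological and socioeconomic demands are simply additive (i.e., that no water serves both purposes), which the source does not state explicitly, and the function and variable names are hypothetical.

```python
# Illustrative water-budget arithmetic for the Yellow River in 2030.
# All volumes are in units of 10^9 m^3 per year, as quoted from
# Liu and Liu (2009). ASSUMPTION: the two demands are additive.

ECOLOGICAL_DEMAND = (25.0, 27.0)  # low and high estimates
SOCIOECONOMIC_DEMAND = 40.0       # economic and social development
NATURAL_RUNOFF = 52.0             # predicted average natural runoff


def shortfall_range(eco, socio, runoff):
    """Return the (low, high) shortfall: total demand minus runoff."""
    eco_low, eco_high = eco
    return (eco_low + socio - runoff, eco_high + socio - runoff)


low, high = shortfall_range(ECOLOGICAL_DEMAND, SOCIOECONOMIC_DEMAND, NATURAL_RUNOFF)
print(f"Total demand: {ECOLOGICAL_DEMAND[0] + SOCIOECONOMIC_DEMAND:.0f}-"
      f"{ECOLOGICAL_DEMAND[1] + SOCIOECONOMIC_DEMAND:.0f} x 10^9 m^3 per year")
print(f"Shortfall: {low:.0f}-{high:.0f} x 10^9 m^3 per year")
```

Under this additive assumption, total demand would be 65–67 × 10⁹ m³ per year against a runoff of 52 × 10⁹ m³ per year, so the shortfall would be 13–15 × 10⁹ m³ per year, roughly a quarter of the predicted natural runoff.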