Introduction

Climate change is expected to reduce oxygen (O2) levels in the coastal zone, expanding existing hypoxic (O2 < 2 mg L−1) regions and creating new ones, causing substantial harm to coastal ecosystems1,2. Observed increases in coastal hypoxia globally are driven by increasing anthropogenic inputs of excess nutrients, decreased O2 solubility in a warmer world, and more rapid rates of microbial activity3. Accurate projections of estuarine O2 concentrations are necessary for developing management strategies that reduce the negative impacts of increasing hypoxia. However, Earth System Models (ESMs) used to simulate global future climate have limited spatial and temporal resolution, which cannot be improved without substantial computational costs. As a result, ESMs are currently unable to simulate the rapid and critical biogeochemical interactions within coastal environments4 that regulate estuarine hypoxia. While ongoing efforts to improve ESM spatial resolution have demonstrated improved skill in some nearshore regions5,6, these efforts remain few in number, and have not been used to evaluate a full set of emission scenarios7.

The challenge of projecting climate impacts in the nearshore environment may be partially circumvented by forcing coastal ocean models and the watersheds that feed them with downscaled climate projections7. This may be done via the direct application of downscaled ESM forcings to a regional coastal ocean–watershed model over a multi-decadal period (a Continuous experiment) or, in order to reduce computational cost, in discrete intervals (a Time Slice experiment). The potential downside to the Time Slice approach is the assumption that the memory in the watershed and ocean models is much shorter than the time interval of the experiment. The addition of a “delta” that perturbs a historical simulation by the difference between future and historical ESM conditions, referred to here as a Delta experiment, has the advantage of comparing to a single historical simulation and hence has an additional computational advantage over a Time Slice experiment when multiple ESMs are considered. However, the Delta approach, typically implemented by modification of the mean annual cycle, suffers from the assumption that interannual and sub-monthly variability in the climate forcing remains unchanged. This assumption is of particular concern because greenhouse warming is leading to increased variability, especially through increases in extreme precipitation events8. It is important to note that only a Continuous experiment can fully simulate the evolution of changes in model dynamics.
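
As a rough illustration of the Delta approach described above, the following minimal sketch perturbs a historical forcing series by the change in the mean annual cycle between a baseline and a future period. The function names, the monthly resolution of the climatology, and the optional multiplicative variant are assumptions made for illustration; they are not taken from the study's code.

```python
import numpy as np


def monthly_climatology(values, months):
    """Mean annual cycle: the average of `values` for each calendar month (1-12)."""
    values = np.asarray(values, dtype=float)
    months = np.asarray(months)
    return np.array([values[months == m].mean() for m in range(1, 13)])


def apply_delta(hist_values, hist_months, base_clim, fut_clim, multiplicative=False):
    """Perturb a historical series by the change in the mean annual cycle.

    base_clim, fut_clim: 12-element climatologies from the baseline and future
    periods of a (downscaled) ESM. The additive form suits variables such as
    temperature; the multiplicative form is an assumed variant often used for
    precipitation-like variables.
    """
    hist_values = np.asarray(hist_values, dtype=float)
    idx = np.asarray(hist_months) - 1  # calendar month -> climatology index
    if multiplicative:
        return hist_values * (fut_clim / base_clim)[idx]
    return hist_values + (fut_clim - base_clim)[idx]
```

Because each calendar month receives a single offset or factor, anomalies about the mean annual cycle carry over from the historical series (unchanged in the additive form, uniformly rescaled in the multiplicative form), which is the limitation of the Delta approach noted above.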

These techniques have been compared, using statistically downscaled climate projections, in multiple terrestrial ecosystems. For example, ref. 9 showed differing impacts of time slice and delta methods based on the limited hindcast skill of downscaled general circulation models in multiple mountainous U.S. watersheds. Ref. 10 found that a delta approach increased extreme discharge events in the Rhine River watershed more than time slice forcings. In addition, ref. 11 reported that their application of the delta method underestimated peak flows relative to a time slice approach. To our knowledge, no analysis of this sort has been done in coastal systems, and thus consequences for coastal water quality and marine biogeochemistry are largely unknown.

The Chesapeake Bay is a coastal ecosystem that has been intensively studied and monitored for decades, with robust scientific support for actions needed to increase dissolved oxygen concentrations and achieve water quality restoration goals12 in the face of climate change stressors. Better quantifying climate change impacts on dissolved oxygen is a high priority for Bay regulatory agencies13,14, and numerous previous studies have found that Bay hypoxia will primarily be worsened by increasing atmospheric temperatures15,16,17. However, large uncertainties associated with future hypoxia remain, as these projections are influenced by the choice of ESM, downscaling methodology, and watershed model18, as well as by diverging emission pathways later this century8. Additional methodological uncertainties also remain as most estuarine modeling studies in this region have applied a delta methodology15,16,17,18; until now, no continuous climate change simulations examining impacts on hypoxia over the twenty-first century have been published for the Chesapeake Bay.

In this study, the impact of climate forcing methodology (Fig. 1, Table 1) on mid-twenty-first century hypoxia projections under a business-as-usual emissions scenario8 is evaluated in a case study for the Chesapeake Bay (Fig. 2). Specifically, a Continuous climate change scenario is compared to two other scenarios, using the Time Slice and Delta methods. Here, the Time Slice experiment is forced by the same atmospheric, oceanic, and terrestrial inputs as the Continuous experiment, but differs from the Continuous experiment in that the years between the two time slices are not simulated. An added perturbation to the baseline climatology is used for the future Delta experiment, meaning that the seasonality and interannual variability of a future scenario are limited by what occurred in the baseline period. Both the Time Slice and Delta experiments therefore require far fewer computational resources to simulate their respective future scenarios, but neither experiment retains effects from the ecosystem memory of intervening years. A comparison of the Continuous and Time Slice experiments will test the assumption of the Time Slice approach, specifically that terrestrial and estuarine model memory has a modest impact on estuarine hypoxia. A comparison of the Continuous and Delta experiments will test the assumption that changes in interannual and sub-monthly climatic variability have a modest impact on future estuarine hypoxia. Additional Watershed Bypass and Estuary Bypass experiments are also conducted, wherein either the watershed model or the estuarine model is spun up and initialized from future forcings on baseline conditions, rather than using Continuous experiment results at the start of a future period. These Bypass experiments will evaluate whether the watershed or estuarine model memory, respectively, contributes most to differences between the Continuous and Time Slice experiments.
The results of these experiments will inform environmental managers and practitioners about the limitations and benefits of using particular techniques for climate impact studies for coastal systems globally.

Figure 1

Schematic illustrating the Delta, Continuous, and Time Slice experiments. The Delta experiment corresponds to a 10-year experiment that retains the climatological pattern of baseline conditions in the future, where a climate delta is computed from the difference in the average annual cycle of a baseline and future 30-year period. The Continuous experiment is an uninterrupted simulation from 1980 to 2065. The Time Slice experiment represents a baseline and future simulation forced by the same conditions as in the Continuous experiment, but without any inclusion of years between these periods. The baseline conditions are the same for both the Time Slice and Delta experiments.

Table 1 Model experiments, initial conditions, and boundary conditions.

Figure 2

ChesROMS-ECB model grid and bathymetry with river input locations from the terrestrial model (DLEM). The dashed white line corresponds to a transect of the estuary’s main deep channel.

Results

Future changes in precipitation, discharge and nutrient loading

Although total future increases in precipitation were equivalent among all experiments, the intensity distribution of this additional volume varied substantially (Fig. 3a). In the 30-year Continuous experiment, for example, precipitation volume decreased by ~ 2–5% for the bottom 10–30% of daily events (P10–P30) and increased by ~ 4–8% in the highest (P80–P100) events. Time Slice experiment results were similar to those of the 30-year Continuous experiment, differing slightly because of unequal simulation length. The Delta experiment showed a markedly different pattern, with consistently increasing precipitation among all percentile ranges relative to its baseline.

Figure 3

Mid-twenty-first century percent changes to average volume for levels of daily (a) precipitation and (b) freshwater discharge, expressed as percentile ranges, where P10 encapsulates the bottom 10% of each experiment’s respective baseline volumes, P20 the lower 10–20% of all daily amounts, etc. The Total set of bars corresponds to the average change among all precipitation (a) and discharge (b) daily levels. Percent change is calculated by computing the difference between a mid-twenty-first century future period and late twentieth century baseline period. The baseline and future periods for the 30-year Continuous experiment correspond to 1981–2010 and 2036–2065, respectively. The baseline and future periods for all other experiments correspond to 1991–2000 and 2046–2055, respectively (Table 1). Changes to precipitation model forcings for the Continuous experiment over 10 years are identical to the Time Slice experiment in (a).
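
One way to compute a percentile-binned comparison of this kind is sketched below. It assumes that each period's daily values are ranked independently and split into ten equal-count bins (P10 to P100), and that dry days are excluded with a hypothetical `wet_threshold` parameter. The study's exact binning procedure is not given here, so this is an illustrative reading of the caption rather than the authors' method.

```python
import numpy as np


def decile_means(daily, wet_threshold=0.0):
    """Mean daily amount within each rank decile (P10 = lowest 10% of days).

    Days at or below `wet_threshold` are dropped (an assumption) so that zero
    precipitation days do not produce empty-volume bins.
    """
    daily = np.asarray(daily, dtype=float)
    ranked = np.sort(daily[daily > wet_threshold])
    return np.array([chunk.mean() for chunk in np.array_split(ranked, 10)])


def percent_change_by_decile(baseline_daily, future_daily, wet_threshold=0.0):
    """Percent change in mean daily amount for each decile, future vs. baseline."""
    base = decile_means(baseline_daily, wet_threshold)
    fut = decile_means(future_daily, wet_threshold)
    return 100.0 * (fut - base) / base
```

Under this reading, scaling every daily value by the same factor yields an identical percent change in every bin, which is qualitatively the pattern reported for the Delta experiment in Fig. 3a.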

The distribution of future changes in freshwater discharge (Fig. 3b) was quite different from that of precipitation, particularly for the Delta experiment. Specifically, the Delta experiment decreased future discharge in lower intensity events (< P60) despite consistent increases in precipitation during low precipitation events (Fig. 3a). The changes in the distributions of discharge in the Continuous and Time Slice experiments were very similar when evaluated over the same 10-year periods, indicating minimal effect of watershed memory on discharge. All three experiments indicated an increase in future precipitation (~ 5–6%; Fig. 3a) and a decrease in future discharge (3–5%; Fig. 3b), indicating that warming-induced increases in evapotranspiration exceeded precipitation increases, similar to what was found for a large ensemble of simulations in the same system19.

Future nutrient loadings were influenced by these patterns in future precipitation and discharge. Changes to average nitrate loadings varied substantially, decreasing in the 30-year and 10-year Continuous experiments by 3.8% and 1.8%, respectively, and increasing by 5.7% and 3.5% in the Delta and Time Slice experiments, respectively (Fig. 4). This difference in sign of average nitrate loadings was largely due to the substantial difference in flow-weighted nitrate concentrations, which increased by 2.0%, 4.6%, 9.3%, and 9.1% in the 30-year Continuous, 10-year Continuous, Delta, and Time Slice experiments, respectively. That the Continuous and Time Slice results differ when evaluated over the same 10-year periods indicates an impact of watershed memory on nitrate loading, in contrast to the finding for discharge.
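
The relationship between discharge, nitrate loading, and flow-weighted nitrate concentration can be illustrated with a short sketch. It assumes the usual definition of flow-weighted concentration (total load divided by total discharge); the function names and the numbers in the usage note are hypothetical and are not values from the study.

```python
import numpy as np


def nitrate_load(discharge, concentration):
    """Total load over the period: sum of daily discharge x concentration."""
    q = np.asarray(discharge, dtype=float)
    c = np.asarray(concentration, dtype=float)
    return float(np.sum(q * c))


def flow_weighted_concentration(discharge, concentration):
    """Total load divided by total discharge (assumed standard definition)."""
    q = np.asarray(discharge, dtype=float)
    return nitrate_load(q, concentration) / float(np.sum(q))


def percent_change(future, baseline):
    """Percent change of a future value relative to its baseline value."""
    return 100.0 * (future - baseline) / baseline
```

For example, with made-up baseline values `q = [10, 20]` and `c = [1, 2]`, a 10% drop in discharge combined with a 5% rise in concentration raises the flow-weighted concentration by 5% while the total load still falls by 5.5%, so loading and concentration can change with opposite signs.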

Figure 4

Mid-twenty-first century percent changes for freshwater discharge, nitrate loadings, and flow-weighted nitrate concentrations. Percent change is calculated by computing the difference between a mid-twenty-first century future period and late twentieth century baseline period. The baseline and future periods for the 30-year Continuous experiment correspond to 1981–2010 and 2036–2065, respectively. The baseline and future periods for all other experiments correspond to 1991–2000 and 2046–2055, respectively. Error bars correspond to the standard error of the temporal differences between baseline and future periods. Nutrient inputs to the watershed are held constant throughout all simulations.

Future estuarine changes

Future changes to estuarine physical variables based on changes to estuarine, oceanic and atmospheric forcing (see Supplementary Table S1) were similar among all experiments (Fig. 5). Individual years in the future Time Slice experiment (2046–2055) for temperature, salinity, and oxygen are essentially identical to those of the Continuous experiment, indicating a minimal impact of model memory. Average surface and bottom temperatures were nearly identical among all experiments; the average increases in surface and bottom temperatures across all experiments were 2.1 °C and 2.0 °C, respectively (Fig. 5a,b). Absolute values of baseline and future salinities were also highly similar for all experiments, resulting in increases of 1.1 units for the 30-year Continuous and Delta experiments, and 1.3 to 1.4 units for the 10-year Continuous and Time Slice experiments (Fig. 5c,d). Additionally, all experiments showed similar decreases in average surface (− 0.35 to − 0.41 mg L−1) and bottom (− 0.44 to − 0.50 mg L−1) O2 levels (Fig. 5e,f). In order to assess watershed and estuarine model memory individually, two additional Time Slice experiments were performed that individually accounted for the ecosystem memory of the terrestrial model and estuarine model, respectively. Simulated changes to temperature, salinity, and oxygen in the additional Bypass experiments were highly similar to the Time Slice experiment, consistent with a minimal impact of model memory, although the Annual Hypoxic Volume (AHV) computed for the Estuary Bypass experiment (AHV = 1339 km3 d) more closely matched the 10-year Continuous experiment (AHV = 1336 km3 d; Table 2) than did the Watershed Bypass experiment (AHV = 1328 km3 d).

Figure 5

Projections of temperature (a,b), salinity (c,d), and oxygen concentrations (e,f) averaged over the entire Chesapeake Bay, and averaged over the surface (a,c,e) and bottom (b,d,f) depth levels. Although the spin up for the Continuous simulation was 10 years (starting in 1980) whereas the spin up for the Time Slice and Delta simulations was only 3 years, the baseline simulations used in all experiments are nearly identical.

Table 2 Average annual hypoxia metrics (differences include ± standard errors) for the baseline, Continuous, Delta, and Time Slice experiments.

The progression of average monthly changes to O2 and apparent oxygen utilization (AOU, which is affected by biogeochemical processes only) along the Bay’s mainstem showed an increasingly accelerated seasonal cycle of hypoxia (Fig. 6). 30-year Continuous experiment results showed that O2 decreases were large in January and February (Fig. 6a). Given the small changes in AOU during these months (Fig. 6b), these large O2 decreases must reflect large decreases in solubility, which is most sensitive to temperature during the winter. In this experiment, a larger spring bloom was initiated earlier from March to May. This resulted in greater production that slightly increased surface O2 and greater remineralization that decreased O2 throughout the majority of the rest of the water column. In May, decreasing O2 levels reached the hypoxia threshold (magenta line) and main stem hypoxic volume expanded relative to average baseline conditions (black dotted line) both upwards in the water column by ~ 1.5 m and further south by ~ 7 km (Fig. 6a). From June to August, average O2 concentrations continued to decrease in the upper 10 m in the upper half of the Bay, but the latitudinal extent of hypoxia retreated northwards slightly (Fig. 6a). In August, there was a substantial deficit in nutrients available for primary production, particularly in the southern half of the Bay, leading to large increases in AOU in the upper 5–10 m throughout the majority of the Bay (Fig. 6b). Throughout the summer, biological oxygen demand at the bottom was also substantially reduced (Fig. 6b), increasing O2 in bottom and mid-depth waters throughout the mid-Bay. This region of improving O2 largely dissipates by September and October, and Bay O2 decreases throughout the remainder of the year (Fig. 6a), affected to a smaller extent by changes in production and remineralization (Fig. 6b). 
The spatial patterns of monthly changes to O2 and AOU were largely similar for the Delta and Time Slice experiments (see supplementary Figs. S1 and S2). Since the future atmospheric and oceanic forcings to the estuary were similar between these experiments, the differences in the magnitude of O2 changes presumably were dependent on the timing and amount of future watershed loadings.
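
For reference, the attribution of O2 changes to solubility versus biogeochemistry can be written out explicitly. The expression below assumes the conventional definition of AOU as the saturation concentration minus the in situ concentration; the text does not spell out the definition it uses, and the symbols \(\mathrm{O_{2,sat}}\), \(T\) (temperature), and \(S\) (salinity) are introduced here for illustration.

```latex
\mathrm{AOU} = \mathrm{O_{2,sat}}(T, S) - \mathrm{O_2}
\quad\Longrightarrow\quad
\Delta \mathrm{O_2} = \Delta \mathrm{O_{2,sat}}(T, S) - \Delta \mathrm{AOU}
```

Under this definition, a month with a large \(\Delta \mathrm{O_2}\) but a near-zero \(\Delta \mathrm{AOU}\) must be dominated by the solubility term, which is the reasoning applied to January and February above.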

Figure 6

Average monthly changes relative to baseline conditions (\(\overline{\text{future}} - \overline{\text{baseline}}\)) along the mainstem transect (Fig. 2 dashed line) for the 30-year Continuous experiment: (a) O2 concentrations and (b) apparent oxygen utilization (AOU). Dotted black and solid magenta lines along the mainstem profile represent the hypoxic contour of dissolved oxygen < 2 mg L−1 for baseline and future conditions, respectively.

Substantial differences in O2 concentrations among the experiments also affected projected levels of future hypoxic volume (Fig. 7). This effect was particularly notable for average AHV; the Continuous experiment increased AHV over the 30-year and 10-year periods by 11 ± 6% and 9 ± 9%, respectively (average ± standard error), while the Delta and Time Slice experiments, which used a 10-year averaging period, increased average AHV by 19 ± 9% and 9 ± 9%, respectively (Table 2; Fig. 7a). Increases in daily levels of hypoxic volume that exceeded 10 km3 (equal to ~ 12% of the entire estuary) for all experiments primarily occurred in early summer (Fig. 7b–e), when remineralization of organic matter produced by the spring bloom peaked. The Delta experiment also lengthened the average hypoxia season by 20 ± 6 days, while the 30-year Continuous, 10-year Continuous, and Time Slice experiments only increased the duration by 5 ± 4 days, 6 ± 5 days, and 8 ± 7 days, respectively (Table 2; Fig. 7b–e). In all experiments, the lengthening of Bay hypoxia was primarily due to an earlier start to low-oxygen conditions, with similar timings for hypoxia termination.
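
The hypoxia metrics discussed above can be sketched as follows, assuming daily model output on a grid with fixed cell volumes. The 2 mg L−1 threshold and the km3 d units come from the text; the array layout, the function names, and the definition of duration as the span between the first and last hypoxic day are assumptions for illustration, not the study's implementation.

```python
import numpy as np

HYPOXIA_THRESHOLD = 2.0  # mg/L, the hypoxia definition used in the text


def daily_hypoxic_volume(o2, cell_volume_km3, threshold=HYPOXIA_THRESHOLD):
    """Hypoxic volume (km^3) on each day.

    o2: array of shape (n_days, n_cells) with O2 in mg/L.
    cell_volume_km3: array of shape (n_cells,); treated as constant in time,
    which is a simplification for a model with a moving free surface.
    """
    o2 = np.asarray(o2, dtype=float)
    vol = np.asarray(cell_volume_km3, dtype=float)
    return np.sum((o2 < threshold) * vol, axis=1)


def annual_hypoxic_volume(daily_hv):
    """Time-integrated hypoxic volume over one year (km^3 d), daily time step."""
    return float(np.sum(daily_hv))


def hypoxia_duration(daily_hv, min_volume_km3=0.0):
    """Days from the first to the last day with hypoxic volume above a cutoff."""
    idx = np.flatnonzero(np.asarray(daily_hv) > min_volume_km3)
    return 0 if idx.size == 0 else int(idx[-1] - idx[0] + 1)
```

Because AHV integrates over both volume and time, an earlier onset of hypoxia raises AHV even when the peak daily hypoxic volume is unchanged.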

Figure 7

Simulated changes to (a) levels of annual hypoxic volume (AHV, km3 d) over the entirety of the Chesapeake Bay (excluding the continental shelf) from the Continuous, Delta, and Time Slice experiments and (b–e) timing of the average baseline (grey) and future (colored) seasonal cycles of hypoxic volume.

The similarity between hypoxia metrics computed over the 10- and 30-year time periods in the Continuous experiment indicates that the two 10-year averages essentially capture long-term change as well as the difference in the two 30-year averages. The nearly identical results for the 10-year Continuous experiment and Time Slice experiment again reflect the minimal impact of model memory. The Delta experiment stands out as having the largest change in hypoxia metrics. This result, combined with the finding that model memory has a minimal impact on simulated hypoxia, does not support the assumption inherent in the delta approach that changes in sub-monthly and interannual variability in climate forcing have minimal impact on estuarine biogeochemistry.

Discussion

Methodological impacts on coastal hypoxia projections

The three methods used to simulate the impact of climate change on Chesapeake Bay hypoxia revealed differences and similarities. All methods produced similar temperature, salinity, and sea surface height changes (Fig. 5 and Supplementary Table S1). The Continuous and Time Slice experiments also showed highly similar results, when compared using the 10-year periods common to both, indicating very little impact of model memory on the biogeochemical results. Differences in watershed discharge and nitrate loadings, however, were clearly evident among experiments (Fig. 4), with direct consequences for future hypoxic conditions and implications for net ecosystem metabolism and coastal carbon export. The Delta experiment showed a smaller decrease in discharge compared to the other experiments, driven by the distribution of precipitation volumes that fundamentally differed from those of the other experiments for the lower 50% of precipitation events (Fig. 3a). This increase in smaller precipitation events for the Delta experiment likely affected the soil water content and runoff coefficient within the terrestrial model DLEM20.

The relatively simple application of equal increases in precipitation volume among all events as applied here is unlikely to match the projected future precipitation distribution of downscaled ESMs or an ensemble of ESMs21, which is already subject to a great deal of uncertainty due to downscaling methodology and other factors18,22. This mismatch in the daily distribution of precipitation volume is also likely evident in simpler sensitivity studies that increase or decrease total precipitation by a single percentage amount, and is embedded within the change-factor methodology used in the Delta experiment. A more robust way of distributing future precipitation increases when applying the delta approach is likely required to better match continuous projections, but is more complicated and cannot be used to evaluate historical model performance23. Applying future downscaled projections directly, as in the Time Slice and Continuous experiments reported here, provides a more robust way to simulate changes in daily discharge distributions.

Complex interactions among different atmospheric and terrestrial factors including precipitation, humidity, soil moisture, evapotranspiration, and vegetation growth actively influence nitrogen uptake, nitrification, denitrification, and soil leaching in the terrestrial model20. These combined factors are critical to understanding the long-term concentrations of bioavailable nutrients exported to coastal regions (Fig. 8). Despite nearly equal increases in annual precipitation, the temporal distribution of the additional rainfall substantially modifies soil moisture and nitrogen cycling in the watershed and consequently affects nitrogen export to the estuary. Because temperature inputs were functionally equivalent among all experiments, greater nitrate inputs to the estuary in the Delta experiment are a direct consequence of increased soil moisture that affects rates of terrestrial biogeochemical cycling. Since all experiments applied the same constant levels of nutrient inputs to the watershed, differences in nitrate concentrations also demonstrate the impacts of continued nitrate uptake over decadal time scales in the Continuous experiment (Fig. 4), highlighting the potential importance of long-term ecosystem memory within terrestrial models.

Figure 8

Schematic showing differences in terrestrial and estuarine biogeochemical responses to changes in distributions of future precipitation between the Continuous experiments (left) and the Delta experiment (right). Digital images and artwork used in this figure were developed by the Integration and Application Network (https://ian.umces.edu/media-library).

Because of greater nitrate loadings, the Delta experiment produced more hypoxia in the mid twenty-first century than the Continuous and Time Slice experiments. Nitrate loadings are a good but incomplete predictor of annual hypoxic volume; observational analyses show that up to half the variability in AHV can be accounted for by nitrate loading24,25,26 and the same is true in ChesROMS-ECB (see Supplementary Fig. S3). Hence, other factors, such as the timing of nutrient delivery, winds, and temperature must play a role. Both the Continuous and Time Slice experiments increase average annual hypoxia by approximately 9% (Table 2; Fig. 7a), and their differences are relatively minor compared to the more than doubled increase found in the Delta experiment. Although increases in average flow-weighted nitrate concentrations were similar for the Delta and Time Slice experiments (Fig. 4), increases in nitrate concentrations in the Delta experiment were concentrated in the spring as opposed to the latter half of the year in the Time Slice experiment, and were likely responsible for the doubled impact of Delta watershed inputs on annual hypoxic volumes. The modest differences that do exist between the Continuous and Time Slice experiments can primarily be attributed to the ecosystem memory present within the watershed model, which primarily affects nitrate concentrations. Annual nitrate loadings in the Time Slice experiment were approximately 7% greater than Continuous experiment inputs (Fig. 4), despite directly applying the same future climate forcings. This change in nitrate loadings slightly increases future annual hypoxic volume (AHV) if the additional nitrate loadings are concentrated in the spring, which is not always true for the Time Slice experiment. 
Differences in the timing of nitrate export between the Continuous and Time Slice experiments may explain similar estimates of increased AHV (Table 2), despite significantly different responses of nitrate concentrations (Fig. 4). These seasonality impacts also affected hypoxia initiation; the Delta experiment begins the hypoxic season approximately 1.5–2 weeks earlier than the Continuous and Time Slice experiments (Table 2; Fig. 7b–e). While changes in the timing and severity of hypoxic conditions are also likely to affect biogeochemical feedbacks, including sediment diagenesis and secondary production, uncertainties introduced by the methodological approaches here are still likely to be less than the changes in water quality realized through the successful implementation of management actions15,18.

The individual and combined effects of watershed and estuarine model memory showed only minor differences between the Bypass and Time Slice experiments (see supplementary Fig. S4), but were likely affected by a number of model assumptions. Extremely similar results between the Time Slice and Estuary Bypass experiments show the limited ecosystem memory present within ChesROMS-ECB, and emphasize the larger (but still relatively small) contribution of watershed model memory from DLEM that decreases long-term soil nitrate export. However, this phenomenon may not hold true for other watershed models; differences in the representation of terrestrial processes have previously been shown to influence future Chesapeake Bay hypoxia18. The depletion of accumulated soil nitrogen is an important component of these findings that may increasingly tie measures of estuarine water quality to the interannual variability of watershed discharge and undercut anticipated biogeochemical stationarity, or the legacy accumulation of watershed nutrients due to anthropogenic actions27. The lack of estuarine model memory in the Chesapeake Bay is largely consistent with previous research demonstrating relatively short residence times28, linkages between water trends and the interannual variability of watershed discharge and total nitrogen loadings29,30, and high rates of estuarine sediment-nutrient recycling throughout the year31.

This assumption of limited estuarine model memory may not hold in similar marine ecosystems also influenced by elevated nutrient loadings, presenting additional sources of uncertainty. Long-term accumulations of phosphorus in bottom sediments in coastal areas like the Baltic Sea32 and the Gulf of Mexico33 would essentially be held static for climate projections simulating future conditions using a climatic delta, and may not mirror results from multi-decadal simulations. In similar marine systems, differing lengths of time for model spin-up may also magnify discrepancies between Continuous and Time Slice experiments that do not begin at the same time. More complex simulations of sediment dynamics in the Chesapeake Bay may also better represent additional shifting baselines necessary for more realistic future projections34, increasing the currently limited impact of estuarine model memory.

These results demonstrate the benefits and tradeoffs inherent to the Delta, Continuous, and Time Slice experiments, and provide a useful hierarchy for model experiment design. The Delta experiment approach to regional climate projections is relatively simple and computationally inexpensive (relative to a long-term simulation), acts to isolate a climate change signal by maintaining historical patterns of interannual variability, and may provide a more accurate understanding of future climate signals by using realistic past conditions as a baseline. The impact of long-term changes in the mean climate is also relatively easy to capture with a Delta experiment, even if computational expense causes runs to be relatively short (e.g., 10 years) because the deltas themselves can be computed over much longer averaging periods (e.g., 30 years). However, the Delta approach is also incapable of determining long-term ecosystem feedbacks, and may misrepresent biogeochemical processes that are more sensitive to highly variable daily forcings like precipitation (Fig. 4a). The Continuous experiment solves many of these issues, but potentially with much greater computational expense, particularly for very long simulations (greater than a century) and without the potential benefit of representing a more realistic historical period. The Time Slice experiment implemented here more or less faithfully represents the results of the Continuous experiment, although it does so without identifying ecosystem responses that may become more important over time. One potential pitfall of the Time Slice approach is that metrics with very high interannual variability, such as hypoxic volume, may require long time intervals in order to capture the climate signal above the background variability. For example, AHV in Chesapeake Bay is highly variable, with the climate signal difficult to discern, even over the long Continuous experiment (Fig. 7a). 
The use of a longer time period for a Time Slice experiment may help avoid misattribution of longer-term climate effects with shorter-term variability35,36.

The evidence presented here supports using a Continuous simulation in preference to Time Slice simulations if computational expense is not a prohibitive factor. A Time Slice simulation that omits estuarine model memory alone is the preferred alternative to a Continuous simulation for rapidly flushed marine systems like the Chesapeake Bay. The Time Slice approach would also seem to make more sense as the time between the baseline and future periods increases and as the signal-to-noise ratio (climate signal vs. interannual variability) in the biogeochemical metric increases. Delta experiments should be applied with appropriate caution, when model memory is unlikely to be an issue and when changes in sub-monthly and interannual variability are expected to have only a modest impact on the biogeochemistry. Altogether, when simulating biogeochemical changes in the coastal environment, researchers should carefully consider the memory and stability of long-term ecosystem responses, the ability of a modeling framework to represent various pathways given underlying stationarity assumptions, and the potential variability of biogeochemical inputs in terrestrial and ocean environments.

Future Chesapeake Bay oxygen projections

Results from the Continuous experiment show that future Chesapeake Bay hypoxia will expand laterally and vertically early in the summer, but retreat down-estuary in mid-summer as organic matter is remineralized faster. Previous research in the Chesapeake Bay has shown that increasing temperatures play a dominant role in reducing oxygen solubility and increasing remineralization rates15,16,17. The importance of increasing temperatures is also demonstrated here; warming was responsible for the rapid increase of production in the late spring and early summer and for the increased export of organic matter throughout the water column that further reduced bottom oxygen concentrations (Fig. 6). Future late summer oxygen losses were most concentrated in the southern half of the Bay and within 10 m of the surface (Fig. 6), denoting an absence of late summer production that is more limited by intensified grazing. Additionally, the lateral retreat and vertical alignment of the hypoxic zone relative to baseline conditions were largely attributable to earlier increases in remineralization of semilabile dissolved organic nitrogen that exhausted the supply of autochthonous organic matter that typically acts as an oxygen sink throughout the summer (Fig. 6). This process largely agrees with recently observed dynamics reported by others37,38, who identified a “speeding-up” of the hypoxic seasonal cycle that was partially attributed to observed estuarine warming.

Long-term changes to Bay oxygen levels and hypoxia shown in the Continuous experiment highlight the importance of better understanding and constraining the evolution of ecosystem dynamics when projecting realistic future conditions. The estuarine model used here represents a simplified version of producer and consumer dynamics, which are known to also be influenced by regular hypoxic conditions39,40. Introducing additional model state variables to capture these dynamics is unlikely to reduce current or future model uncertainty41. However, an examination of temperature-dependent functions for potential shifts in dominant phytoplankton and zooplankton groups, due to both reduced nutrient loadings42,43,44 and species-optimal thermal acclimation45, may provide additional insights into the possibility of the Bay undergoing a fundamental “regime shift”46.

Refining regional hypoxia projections

Previous research on projected Chesapeake Bay climate impacts has found that increasing temperatures and sea surface height are likely to increase estuary temperatures and salinity16,17,47,48,49. There is agreement among multiple studies that increasing temperatures will more substantially decrease Bay dissolved oxygen levels by reducing solubility and increasing biogeochemical rates15,16,17. Sea level rise impacts on dissolved oxygen are more mixed overall50 and can be modified by enhanced estuarine circulation15, increased stratification strength16,49, tidal responses51, and enhanced production due to an increase in shallow areas52. Our findings are largely in agreement with these previous results, and reinforce that the greatest source of remaining uncertainty lies in characterizing the Bay’s response to changing nutrients from the watershed, which are affected by the choices of Earth System Model, downscaling methodology, and watershed model18, in addition to the distribution of future increases in precipitation events. Besides producing differing estimates of hypoxia, the relative discrepancy between experimental approaches is likely to substantially influence other biogeochemical processes such as carbon export, which has been previously shown to change direction due to the influence of increased Chesapeake Bay net primary production53. Besides influencing future rates of Bay acidification54,55,56, methodological uncertainties identified here may affect whether the estuary acts more or less as a sink of atmospheric CO2.

A multi-pronged effort among research institutions could also narrow the range of likely hypoxia futures by evaluating common metrics to find particular points of agreement and disagreement. Applying long-term projections to a suite of estuarine biogeochemical models will better constrain outcome uncertainty57, particularly with respect to more complex representations of sediment diagenesis58 and wetland interactions59. A combined effort studying different biogeochemical responses should utilize direct climate projections in such a regional application, avoiding the delta method that can consequentially alter watershed export of discharge and nutrients and may be magnified with other ESMs. Such an effort has been ongoing over the past decade in the Baltic Sea60, another large coastal area seeking to decrease nutrient loadings in a multi-jurisdictional framework. Future work may also benefit from simplified data-based modeling approaches and metamodels to rapidly simulate a larger distribution of future changes to physical and biogeochemical dynamics in the Chesapeake Bay61,62,63.

Serious challenges remain in narrowing the range of projected water quality outcomes in the Chesapeake Bay, despite major advancements in the representation of the linked terrestrial–coastal ecosystem in recent decades64. Many potential negative water quality consequences for a warmer, more stratified estuary can be overcome by meeting terrestrial nutrient reduction targets12, which have been repeatedly shown to offer a pathway to improved oxygen levels despite multiple climate change pressures15,18. The influence of watershed sediment export due to more extreme precipitation events and subsequent resuspension within the estuary will also influence biogeochemical cycling34,65,66, but our understanding of the linkages between climate projections for precipitation and wind events and their subsequent impacts is limited67. An improved representation of changing phytoplankton dynamics can also help better determine how nutrient recycling may vary in the Bay’s bottom waters, and better quantify the potential for future untested legacy effects of eutrophication. Generating a range of consistent estimates at the base of the coastal food web will continue to pay dividends when projecting impacts on higher trophic level species and the communities that rely on them, and will help refine scientific tools to better prepare for unanticipated ecosystem changes.

Conclusions

This study investigated differences in future hypoxia projected by a regional model of the Chesapeake Bay using three different climate scenario methodological approaches: a Continuous simulation spanning 1980–2065, a Delta simulation with a change in climatic conditions applied to a 1990s baseline, and a future Time Slice simulation representative of mid-twenty-first century conditions compared to the same 1990s baseline. Despite nearly equal changes in estuarine physical conditions (i.e., temperature and salinity), the Delta method increased average hypoxia by 19%, nearly twice the amount projected by the Continuous (11%) and Time Slice (9%) methods. The greater increase in hypoxia is primarily driven by increases in nitrate loadings in the Delta experiment, which are themselves due to increasing watershed nitrate concentrations and a lesser decrease in annual flow. Additionally, results from the Continuous and Time Slice simulations show that hypoxic conditions initiate 2–3 weeks earlier than baseline conditions but will also exhaust nutrients more rapidly, leading to equivalent or slightly lower levels of late-summer hypoxia.

Based on these conclusions, we can provide several recommendations for future research directions. When there is relatively little ecosystem memory, the Time Slice method is a reliable alternative to the Continuous method for climate projections of coastal hypoxia. The aforementioned differences in watershed nitrate concentrations found in the Delta experiment warrant caution when using this approach, particularly when simulating changes to precipitation intensity, duration, and frequency that affect terrestrial biogeochemistry. Additionally, the methodological approach should be chosen carefully based on a regional model’s ability to account for the internal ecosystem memory of biogeochemical dynamics. Earlier increases in hypoxic conditions and elevated levels of remineralization that limit secondary production in the summer, previously reported and reproduced here, may vary with respect to nutrient reduction efforts in the watershed. Simulated responses to biogeochemical changes in future conditions are dependent upon a multitude of implicit factors and potential feedbacks, and researchers should continue to investigate underlying assumptions and points of uncertainty in the experimental design that may result in significant differences for projected hypoxia.

Methods

Modeling framework

Estuarine model

This work applied a three-dimensional, fully coupled hydrodynamic–biogeochemical model, with 20 vertical levels and approximately 1 km horizontal resolution, to simulate future changes in the Chesapeake Bay18,53,68 (Fig. 2). The hydrodynamic model uses the Regional Ocean Modeling System (ROMS;69) implemented in the Chesapeake Bay (ChesROMS;70) with an Estuarine Carbon Biogeochemistry (ECB) component53,71. The coupled model (ChesROMS-ECB) explicitly represents estuarine nitrogen and carbon processes and includes single phytoplankton and zooplankton state variables and two detrital size classes. Parameters defining the maximum growth rate of phytoplankton and the critical bottom shear stress were the same as those used in18.

Terrestrial model

In this study, ChesROMS-ECB received daily estimates of watershed discharge, nitrogen loading, and carbon loading from the Dynamic Land Ecosystem Model (DLEM;20,72,73) at ten river input points around the estuary (Fig. 2). DLEM is a process-based terrestrial ecosystem model that is used to simulate fluxes of water, carbon, and nitrogen while accounting for climate change and land-use change74. DLEM has previously performed well under observed conditions when applied to the Chesapeake Bay watershed18,68.

Climate inputs

Simulations of historical conditions and mid-twenty-first century projected change to the Chesapeake Bay were conducted using Continuous, Delta, and Time Slice methodologies (Fig. 1; Table 1). In all experiments, past and future climate forcings were derived from the IPSL-CM5B-LR model (r1i1p1 ensemble member;75), which is part of the 5th Phase of the Coupled Model Intercomparison Project (CMIP5;76). This ESM was previously identified as the centroid among multiple downscaled ESMs when considering changes to atmospheric precipitation and temperature over the Chesapeake Bay watershed18. A future climate scenario representative of continued increases in greenhouse gas emissions was selected for this analysis: Representative Concentration Pathway (RCP) 8.577. This scenario increases average global radiative forcing by 8.5 W m−2 in 2100 and by approximately 4.0 W m−2 in 2050 relative to pre-industrial conditions in the selected ESM77,78,79.

The atmospheric output from the selected ESM was statistically downscaled and bias corrected using Multivariate Adapted Constructed Analogs (MACA;80) and applied to both the terrestrial and estuarine models. Daily downscaled estimates of temperature, precipitation, and net shortwave radiation were applied directly to DLEM, and the remaining atmospheric forcings were calculated internally. In contrast, the estuarine model was forced by daily MACA-downscaled inputs of atmospheric temperature, relative humidity, air pressure, net shortwave radiation, wind speed and direction, and precipitation, as well as downwelling longwave radiation computed from selected MACA variables81. Local diurnal cycles were imposed on downscaled estimates of daily shortwave radiation inputs internally within the estuarine model.

Ocean forcings included both physical and biogeochemical ocean boundary conditions along the open boundary of the estuarine model grid. These forcings were combined from equations for regional sea level rise trends as well as ESM outputs representative of the historical period (1865–2005) and the future climate scenario period (2006–2100). Specifically, ocean temperature was derived from the oceanic component of the ESM at the grid cell nearest to the estuarine model boundary, averaged over the upper 40 m (approximately equivalent to the depth of the mouth of Chesapeake Bay), and then bias-corrected using a quadratic relationship derived from a comparison with the World Ocean Database82 observations. Salinity, carbon, and nitrogen concentrations for the historical period were based on information from the World Ocean Database and taken from83. O2 concentrations were set to saturation conditions as in83. Ocean boundary forcings are prescribed at monthly intervals as in previous work54 for physical and biogeochemical variables. Sea surface height forcings representative of 1980 conditions at the model’s open boundary were derived from hourly coastal observations and the Advanced Circulation model84 as in previous work18. Long-term changes in sea surface height were added to observed 1980s levels using the equation provided by85, which is based on defined parameters for long-term projections at the Norfolk, VA tidal gage.
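The ocean temperature processing described above (an upper 40 m average followed by a quadratic bias correction) can be illustrated with a minimal sketch. The thickness weighting of the depth average, the function names, and the coefficients `a`, `b`, and `c` are assumptions made for illustration; the fitted coefficients are not reported here.

```python
def upper_layer_mean(layer_tops_m, layer_bottoms_m, temps_c, max_depth_m=40.0):
    """Thickness-weighted mean temperature over the upper `max_depth_m` metres.

    Assumption: layers are weighted by the portion of their thickness that
    lies above `max_depth_m`; the text only states that values were averaged.
    """
    total, weight = 0.0, 0.0
    for top, bottom, temp in zip(layer_tops_m, layer_bottoms_m, temps_c):
        thickness = min(bottom, max_depth_m) - top  # part of the layer above the cutoff
        if thickness > 0:
            total += temp * thickness
            weight += thickness
    return total / weight


def quadratic_bias_correction(temp_c, a, b, c):
    """Apply a quadratic correction a*T^2 + b*T + c (coefficients hypothetical)."""
    return a * temp_c * temp_c + b * temp_c + c


# Toy ESM column: four layers, only the first three intersect the upper 40 m.
tops = [0.0, 10.0, 30.0, 50.0]
bottoms = [10.0, 30.0, 50.0, 100.0]
temps = [20.0, 18.0, 14.0, 8.0]
t_upper = upper_layer_mean(tops, bottoms, temps)
t_corrected = quadratic_bias_correction(t_upper, a=0.0, b=1.0, c=0.0)  # identity fit
```

In practice the coefficients would come from a regression of ESM temperatures against World Ocean Database observations over a common period.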

Experimental design

Continuous experiment

The Continuous experiment simulated the daily variability and long-term changes in the evolution of climate model impacts over the period 1980–2065 using daily, bias-corrected ESM outputs, together with watershed and coastal ocean inputs as previously described (Table 1). The terrestrial model simulated dynamic historical conditions from 1900 to 1980 using observed climate (from PRISM;86), land use, and nutrient inputs73, and then held land use and nutrient inputs constant from 1980 to 2065. The estuarine model was spun up for three years prior to the start of the full 86-year simulation using ESM and watershed forcings from 1980 to 1983. An additional long-term estuarine model simulation was completed over the same time period as the Continuous experiment (1980–2065) but applied the same atmospheric, oceanic, and watershed forcings representative of 1980 conditions for each year, following an approach described by87. This additional simulation quantified model drift over the simulation period and revealed no significant trend in hypoxia (a decrease of 0.6%); interannual variability in annual hypoxic volume was ~1% (16.1 km3 d) of the long-term average (1295.3 km3 d). Therefore, model drift present within the long-term simulation of the estuarine model produced negligible changes in biogeochemical outcomes.
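A drift check of this kind can be sketched as follows. The choice of an ordinary least-squares trend and of the sample standard deviation as the measure of interannual variability are assumptions for illustration; the text does not state which statistics were used, and `drift_summary` is a hypothetical name.

```python
import statistics


def drift_summary(years, ahv):
    """Summarize drift in a repeated-forcing control run.

    Returns (slope, total_change_pct, variability_pct), where
    - slope is the least-squares trend in AHV per year,
    - total_change_pct is the trend accumulated over the run, as % of mean AHV,
    - variability_pct is the sample standard deviation as % of mean AHV.
    """
    mean_ahv = statistics.mean(ahv)
    mean_year = statistics.mean(years)
    sxx = sum((y - mean_year) ** 2 for y in years)
    sxy = sum((y - mean_year) * (a - mean_ahv) for y, a in zip(years, ahv))
    slope = sxy / sxx
    total_change_pct = 100.0 * slope * (years[-1] - years[0]) / mean_ahv
    variability_pct = 100.0 * statistics.stdev(ahv) / mean_ahv
    return slope, total_change_pct, variability_pct


# Toy three-year control run with a small downward drift (values are illustrative).
slope, change_pct, var_pct = drift_summary([1980, 1981, 1982], [101.0, 100.0, 99.0])
```

For the values reported above, 16.1 km3 d relative to 1295.3 km3 d corresponds to about 1.2%, consistent with the stated ~1%.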

Time slice experiment

A second main experiment directly applied the same daily ESM forcings to a 10-year baseline period (1991–2000) and a 10-year future period (2046–2055) as in the Continuous experiment, without simulating the intervening years in the estuarine or watershed models (Fig. 1; Table 1). Watershed model conditions were initialized using the same approach as the Continuous experiment. Also like the Continuous experiment, the estuarine model was spun up for three years prior to the start of each 10-year simulation using ESM and watershed forcings representative of baseline and future conditions. Therefore, in this experiment, neither the terrestrial model nor the estuarine model retained any ecosystem memory of conditions leading up to the start of the future period, while the atmospheric and oceanic forcings for the baseline and future periods were the same as those used in the Continuous experiment. Two additional, closely related experiments, referred to as Watershed Bypass and Estuary Bypass, individually accounted for the ecosystem memory of the terrestrial model and estuarine model, respectively. The Watershed Bypass experiment used future watershed model starting conditions equivalent to those in the combined Time Slice experiment, but applied future estuarine starting conditions that matched the Continuous experiment. Conversely, the Estuary Bypass experiment used future estuarine model starting conditions that matched the combined Time Slice experiment while retaining future watershed model starting conditions from the Continuous experiment.
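The starting conditions of the future period in these experiments can be summarized compactly. The following sketch encodes only what is described above; the class and function names are hypothetical and do not correspond to any model configuration file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FutureStart:
    """Whether the future period starts from a state carried over from the
    Continuous run (True) or from a fresh initialization (False)."""
    watershed_memory: bool
    estuary_memory: bool


EXPERIMENTS = {
    "Continuous": FutureStart(watershed_memory=True, estuary_memory=True),
    "Time Slice": FutureStart(watershed_memory=False, estuary_memory=False),
    "Watershed Bypass": FutureStart(watershed_memory=False, estuary_memory=True),
    "Estuary Bypass": FutureStart(watershed_memory=True, estuary_memory=False),
}


def omitted_memory(name):
    """List the model components whose memory is bypassed in an experiment."""
    start = EXPERIMENTS[name]
    omitted = []
    if not start.watershed_memory:
        omitted.append("watershed")
    if not start.estuary_memory:
        omitted.append("estuary")
    return omitted
```

Differencing each Bypass experiment against the Continuous experiment therefore isolates the effect of a single component's memory, whereas the combined Time Slice experiment omits both.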

Delta experiment

The Delta experiment (Fig. 1; Table 1) simulated a baseline period (1991–2000) that was identical to the Time Slice baseline, and a future period representative of mid-century conditions (2046–2055). In contrast to the typical implementation of the Delta approach, which uses observed climate forcing for the historical simulation, here we use climate forcing solely from the ESM. By doing so, we facilitate a straightforward comparison of the Continuous, Time Slice, and Delta experiments. For application to the watershed model, climatic changes in atmospheric forcings were calculated from the mean annual cycles of a 30-year reference period (1981–2010) and a future mid-century period (2036–2065). For all variables except precipitation, the absolute difference in the mean annual cycles was computed; for precipitation, the monthly fractional change was computed and applied instead. To determine the 2046–2055 atmospheric forcings, these changes were applied to the 1991–2000 forcings. In this way, the baseline period of the Delta experiment was the same as in the Continuous and Time Slice experiments (1991–2000), and the future period retained the same interannual and sub-monthly variability as in the baseline period. Like the spin-up period for the Continuous experiment, DLEM was initialized in 1900 using PRISM atmospheric forcings86, and nutrient input levels were held constant from 1980 onwards. However, PRISM was used continuously to force DLEM until 1991, the point at which ESM climate baseline and delta conditions were used to force the watershed model. Initial conditions for the estuarine model also included a three-year spin-up period using baseline (1991–1993) ESM and DLEM forcings.
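The delta calculation described above can be sketched as follows. This is a minimal illustration assuming a monthly mean annual cycle for all variables; the function names and the toy values are hypothetical and are not taken from the model code.

```python
from collections import defaultdict


def monthly_mean_cycle(months, values):
    """Mean annual cycle: average of all values in each calendar month (1-12)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for month, value in zip(months, values):
        sums[month] += value
        counts[month] += 1
    return {month: sums[month] / counts[month] for month in sums}


def apply_delta(base_months, base_values, ref_cycle, fut_cycle, fractional=False):
    """Perturb a baseline daily series by the change in the mean annual cycle.

    Absolute (default): future = baseline + (fut - ref), e.g. temperature.
    Fractional:         future = baseline * (fut / ref), e.g. precipitation.
    Assumes ref_cycle has no zero entries when fractional=True.
    """
    future = []
    for month, value in zip(base_months, base_values):
        if fractional:
            future.append(value * fut_cycle[month] / ref_cycle[month])
        else:
            future.append(value + fut_cycle[month] - ref_cycle[month])
    return future


# Toy baseline: two January days and one February day.
months = [1, 1, 2]
future_temp = apply_delta(months, [0.0, 2.0, 5.0], {1: 1.0, 2: 5.0}, {1: 3.0, 2: 6.0})
future_precip = apply_delta(months, [0.0, 10.0, 4.0], {1: 2.0, 2: 4.0},
                            {1: 3.0, 2: 2.0}, fractional=True)
```

In the toy precipitation series the dry day remains dry and only the wet-day amounts are rescaled, which illustrates why this approach leaves the frequency and timing of events, and hence sub-monthly and interannual variability, unchanged.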

Method comparison

Differences introduced by the methodological approaches described above were quantified by comparing long-term projections of physical and biogeochemical change within the estuary. Estimates of annual hypoxic volume (AHV, km3 d), which integrate daily hypoxic volume (HV, km3) over a full year88, were calculated by summing the volume of model grid cells containing daily average oxygen concentrations below a specified threshold (< 2 mg L−1) within the Chesapeake Bay and excluding the continental shelf. Average percent and absolute changes in these metrics were evaluated over 10 years for the Continuous, Delta, and Time Slice experiments, comparing the 1991–2000 baseline against the future period of 2046–2055. Comparing the Continuous and Time Slice experiments (as well as the associated Bypass experiments) over the 10-year time periods allows for a direct assessment of the impacts of terrestrial and estuarine model memory. A period of 30 years was also used for the Continuous experiment, with baseline and future periods spanning the years 1981–2010 and 2036–2065, respectively. This longer comparison period is needed for a comparison to the Delta experiment, because the change in climatic forcing applied in the Delta experiment is derived by averaging over these same two 30-year spans of ESM output to minimize the effect of decadal oscillations. Comparing the 10- and 30-year periods for the Continuous experiment allows for an assessment of the representativeness of a single decade in capturing long-term climate change.
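The hypoxic volume metrics defined above can be illustrated with a minimal sketch. It assumes time-invariant grid cell volumes and a boolean mask that excludes shelf cells; both simplifications, along with the function names, are assumptions for illustration rather than a description of the model post-processing.

```python
HYPOXIA_THRESHOLD = 2.0  # mg L-1, the threshold stated in the text


def daily_hypoxic_volume(o2_mg_per_l, cell_volume_km3, in_bay):
    """HV (km3): total volume of in-Bay cells with daily-mean O2 below threshold."""
    return sum(
        volume
        for o2, volume, inside in zip(o2_mg_per_l, cell_volume_km3, in_bay)
        if inside and o2 < HYPOXIA_THRESHOLD
    )


def annual_hypoxic_volume(daily_o2_fields, cell_volume_km3, in_bay):
    """AHV (km3 d): daily HV integrated over the year with a one-day time step."""
    return sum(
        daily_hypoxic_volume(day, cell_volume_km3, in_bay) for day in daily_o2_fields
    )


def percent_change(baseline, future):
    """Percent change of a future metric relative to its baseline value."""
    return 100.0 * (future - baseline) / baseline


# Toy grid of three cells; the third lies on the shelf and is excluded.
volumes = [1.0, 2.0, 4.0]
mask = [True, True, False]
two_days = [[1.5, 3.0, 0.5], [1.0, 1.9, 1.0]]
ahv = annual_hypoxic_volume(two_days, volumes, mask)
```

Averaging AHV over each 10-year or 30-year period before calling `percent_change` gives the period-mean changes compared across experiments.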