Introduction

Hurricanes and severe storms, such as Hurricane Harvey in 2017, have significant impacts on water quality in estuarine systems that have yet to be fully elucidated. Due to safety and access challenges, most characterization efforts occur after the event, with delays of up to several weeks in some cases (Adams et al. 2007; Amaral-Zettler et al. 2008; Gong et al. 2007; Hagy et al. 2006; Huang et al. 2013; Mallin and Corbett 2006; Pardue et al. 2005). Storm surge, for example, can drastically, albeit temporarily, increase bay and estuary salinities (Huang et al. 2013), while severe rain events may have the opposite effect (Filippino et al. 2017).

While ambient changes in pollutant sources and changes in hydrologic and hydrodynamic conditions within water bodies influence water quality on a continuous basis, the magnitude of change is typically within historical ranges. Because of their acute, episodic nature, hurricanes and severe weather events, on the other hand, can affect the quantity, quality, and location of water for the duration of the storm to a degree that falls outside historically observed norms. While there is evidence that water quality recovers relatively quickly from severe events (Chen et al. 2017; Greening et al. 2006; Hagy et al. 2006; Huang et al. 2011), other studies indicate long-term impacts (Beaver et al. 2013; Wetz and Yoskowitz 2013).

Record rainfall during hurricanes brings tremendous amounts of suspended sediment into near-coastal systems, and this sediment is often contaminated with a variety of pollutants such as heavy metals and organics (Romanok et al. 2016; Warren et al. 2012). In addition to the pollutant loads it may carry, suspended sediment blocks light from penetrating the water column, which impacts aquatic life, and increases light scattering, which can raise water temperatures. When solids settle, they may cover eggs or oyster beds. The upper tolerance of total suspended solids (TSS) for most species is 80–100 mg/L, while some bottom invertebrates can be harmed by TSS concentrations as low as 10–15 mg/L (Griffiths and Walton 1978; Jha and Swietlik 2003). Sediments from urban systems often carry pollutant loads ranging from heavy metals to chlorinated compounds (Rossi et al. 2013; Sikorska et al. 2015).

For instance, Hurricane Katrina had a minor and relatively short-term impact on water quality in the Gulf of Mexico, both for conventional water quality measures (algae, nutrients, salinity, and TSS) and for trace metals in sediments (Smith et al. 2009; Warren et al. 2012). Bays and estuaries also appear to be more resilient to large coastal storms than to other extreme climatic events such as drought (Smith and Caffrey 2009; Wetz and Yoskowitz 2013). The 2017 hurricane season may change perceptions regarding system recovery, however, particularly in the case of Hurricane Harvey. This storm was not accompanied by surge along the Houston-Galveston coast but produced unforeseen amounts of rainfall for several days and extreme amounts of suspended sediment that caused the region’s rivers to be referred to as “rivers of brown.”

Due to dangerous conditions during and immediately after extreme storms, conventional water quality and sediment measurements are nearly impossible. Additionally, water quality monitoring requires resource-intensive field events and produces relatively sparse datasets. Most studies of storm impacts on water quality examine conditions weeks or months after the event has passed (Chen et al. 2017; Du et al. 2019; Filippino et al. 2017; Romanok et al. 2016). Satellite imagery offers an increasingly valuable resource for inland and coastal water quality characterization, although its scope is limited for narrow or small water bodies, where low spatial resolution and obstructions such as overhanging trees and clouds impede the satellite view of the water surface (Matthews 2011; Mouw et al. 2015; Tyler et al. 2016). Remotely sensed satellite imagery can provide information immediately after a large storm, before field teams are able to mobilize, thus providing valuable insights not otherwise available from conventional water quality assessments. With spatial resolutions ranging from 30 m to as fine as 0.3 m, various satellites such as Landsat and Sentinel-2 satisfy the minimum spatial, temporal, and spectral resolution needed for inland and coastal water quality characterization (Tyler et al. 2016). Satellite data can be used to capture the aforementioned changes in natural water systems due to hurricanes and extreme weather events because remote sensing reflectance (Rrs) data are information-rich and have satisfactory spatial and temporal resolution.

Certain water quality constituents, e.g., TSS and chlorophyll-a, are considered optically active, which means that changes in their concentrations will produce proportional changes in reflectance at specified wavelengths allowing their concentrations to be quantified (Morel and Prieur 1977; Werdell and Bailey 2005). Improved spatial and radiometric resolution in recent years has made this technology more applicable to inland water quality applications. The use of remote sensing to quantify the concentrations of suspended sediment in surface waters has been in place for decades (Curran and Novo 1988; Gitelson et al. 1993; Lee et al. 2002; Mertes et al. 1993; Ritchie et al. 2003). The literature is replete with studies that have utilized reflectance data for water quality characterization under non-extreme conditions (Brando and Dekker 2003; Caballero et al. 2014; Gurlin et al. 2011; Hamidi et al. 2017; Sakuno et al. 2014; Zheng et al. 2015). However, none of these studies specifically applied remote sensing tools to investigate the water quality impacts of extreme climatic events. Additionally, the timing of a storm’s passage and satellite image capture is unpredictable. Hurricane Harvey happened to coincide well with several Sentinel-2 images. The lack of existing Sentinel-2 algorithms, in addition to the previously discussed limitations in existing TSS algorithms, prompted the development of a new algorithm which was assessed along with some previously developed for other multi-spectral satellites (Caballero et al. 2014; D’Sa et al. 2007; Nechad et al. 2010; Pahlevan et al. 2017; Zheng et al. 2015). This research demonstrates the viability of using remote sensing reflectance to elucidate the water quality impacts of extreme climatic events (e.g., hurricanes) on surface water in near real-time. 
This is a novel approach that, to the best of the authors’ knowledge, has not been attempted previously for inland waterways and coastal estuarine systems such as the Houston Ship Channel and Galveston Bay System (HSC-GBS), which is studied here following Hurricane Harvey.

Methods

Study area and event

Figure 1 shows the study area that was the focus of the research: the Trinity and San Jacinto Rivers are the primary water bodies that drain to the Galveston Bay (GB) estuary system and discharge into the Gulf of Mexico. The GB system encompasses a variety of waterbody types including relatively small freshwater and tidal streams, large fresh and tidal rivers, several lakes, and a shallow turbid estuary/bay system. Flat topography, low slopes, and high clay soil content result in relatively low infiltration and a propensity for flooding. Developed land, which covers much of the study area today, reflects the expansion of the City of Houston and has increased the spatial extent of impervious cover.

Fig. 1 Study area map including land use, spatial coverage of the Sentinel-2 satellite, and water quality monitoring (WQM) stations

Between August 25th and 31st of 2017, Hurricane Harvey released approximately 63 to 127 cm (25–50 in.) of rain on the Trinity and San Jacinto River Basins and the Galveston Bay estuary. The record rainfall generated record flows containing significant amounts of suspended solids. Flow data were obtained from the USGS, while the TSS historical data record was obtained from the Texas Commission on Environmental Quality (TCEQ) database (TCEQ 2018).

Data acquisition and processing

The European Space Agency’s (ESA) Sentinel-2 is a multi-spectral satellite with 10-m pixel resolution in the visible and near infrared (NIR) bands, and a 2–10 day re-visit time depending on latitude. The Sentinel-2 satellite captured nearly cloud-free images at critical dates in the storm’s path: three days before storm arrival (8/22/2017) and one day after its passing (9/1/2017). Figure 2 provides a true-color image of Galveston Bay, captured by Sentinel-2, for each of the aforementioned dates as well as images captured on 9/11/2017 and 10/1/2017. The Sentinel-2 mission launched in the summer of 2015, so there was a limited record of satellite images captured coincident with water quality samples. The spectral bands of interest were the visible (RGB) and NIR (832–852 nm) bands. Select water quality monitoring (WQM) stations were identified within the Galveston Bay (GB) estuary based on the availability of data and their representativeness across the waterbody (Fig. 1).

Fig. 2 Sentinel-2 true-color images of Galveston Bay before and after Hurricane Harvey (8/22/17, 9/1/17, 9/11/17, and 10/1/17)

At the time of the study, Sentinel-2 Level 1C imagery was only available in top-of-atmosphere (TOA) reflectance format, without any atmospheric correction. Sentinel images were viewed and processed in the Sentinel Application Platform (SNAP), an open source software and user interface provided by the ESA. The Sen2Cor processor, also provided by the ESA (ESA 2017), was applied to perform the atmospheric, terrain, and cirrus correction of the TOA Level 1C input data. Sen2Cor produces bottom-of-atmosphere (BOA; Level 2A) reflectance images, optionally terrain- and cirrus-corrected, as well as scene classification maps. More details on the atmospheric correction algorithm are given in ESA (2017). Standard assumptions were used, including the “AUTO” setting, which accounts for seasonality and aerosol type. The Sen2Cor classification map, which identifies land, water, ice, clouds, and cloud cover, was applied to isolate water pixels; white caps, high wind speeds, and foam formation are infrequent in Galveston Bay, and visual inspection confirmed that pixel classification was of sufficient quality. Galveston Bay is a characteristically turbid waterbody; nevertheless, water column depth and transparency were compared to account for the possible influence of bottom reflectance. The ratio of transparency (Secchi disc depth, m) to water depth (m) in coincident samples ranged from 4.5 to 41.4%. Because the transparency of the water column never reached even 50% of the water depth, the influence of bottom reflectance was assumed to be negligible.
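The bottom-reflectance screen described above amounts to a simple ratio check. The following is a minimal sketch, not the authors' procedure: the function name and the sample values are hypothetical, and the 50% cut-off is taken from the wording of the text rather than from a stated criterion.

```python
def bottom_reflectance_negligible(secchi_m, depth_m, max_ratio=0.5):
    """Return (ratios, ok): the Secchi-to-depth ratio for each coincident sample,
    and whether every ratio stays below max_ratio (assumed cut-off of 50%)."""
    if len(secchi_m) != len(depth_m):
        raise ValueError("secchi_m and depth_m must have the same length")
    ratios = [s / d for s, d in zip(secchi_m, depth_m)]
    return ratios, all(r < max_ratio for r in ratios)


# Hypothetical coincident measurements in metres (not values from the study).
secchi = [0.3, 0.5, 0.9]
depth = [3.0, 2.5, 2.2]
ratios, ok = bottom_reflectance_negligible(secchi, depth)
print([f"{100 * r:.1f}%" for r in ratios], ok)
```

A station where the Secchi depth approaches the water depth would fail the check and would warrant a closer look before its reflectance is used.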

Several hundred TSS samples have been collected (EPA Method 160.2) within the GB estuary since the Sentinel-2 satellite launch. The mean TSS concentration for the developed dataset was 30 mg/L, with a median concentration of 21 mg/L and a range of 3–137 mg/L. Coincident TSS samples and Rrs datasets were matched and processed. An allowance of up to three days between sample collection and image capture resulted in a total of 46 data pairs (see Fig. 1) from eight Sentinel-2 scenes. Using this approach, 64% of data pairs were collected within one day or less. Data pairs with two to three days between sample collection and image capture were only included if no rainfall was recorded in the preceding five days, so that TSS concentration changes could be assumed to be minimal. High chlorophyll-a concentrations can interfere with the TSS spectral signal (Reisinger et al. 2017). Where coincident chlorophyll-a samples were available, data pairs with chlorophyll-a concentrations greater than 30 μg/L were removed (n = 2). Since cloud shadow identification is not as robust for the Sentinel-2 mission, the Sentinel-2 images were visually inspected at the locations of each data pair to remove any points containing cloud shadow or other fine-scale variability such as ships and barges.
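The match-up rules above (a three-day allowance, a dry-weather condition for lags of two to three days, and a chlorophyll-a screen) can be sketched as a small filter. This is an illustrative sketch only: the record layout, function name, and sample data are hypothetical, and the five-day dry window is assumed here to end on the later of the sampling and image dates, which the text does not specify.

```python
from datetime import date


def match_pairs(samples, scene_dates, rain_dates,
                max_lag_days=3, dry_days=5, chl_max=30.0):
    """Pair each TSS sample with the nearest scene and apply the screening rules."""
    pairs = []
    for s in samples:
        # Nearest scene in time to the sample.
        scene = min(scene_dates, key=lambda d: abs((d - s["date"]).days))
        lag = abs((scene - s["date"]).days)
        if lag > max_lag_days:
            continue
        if lag >= 2:
            # Assumption: the dry window ends on the later of the two dates.
            later = max(scene, s["date"])
            if any(0 <= (later - r).days <= dry_days for r in rain_dates):
                continue
        # Chlorophyll-a screen, applied only where a coincident value exists.
        chl = s.get("chl")
        if chl is not None and chl > chl_max:
            continue
        pairs.append({"station": s["station"], "scene": scene,
                      "lag_days": lag, "tss": s["tss"]})
    return pairs


# Hypothetical records for illustration only.
scenes = [date(2017, 3, 10), date(2017, 4, 9)]
rain = {date(2017, 3, 11)}
samples = [
    {"station": "A", "date": date(2017, 3, 10), "tss": 20.0},
    {"station": "B", "date": date(2017, 3, 11), "tss": 35.0, "chl": 45.0},
    {"station": "C", "date": date(2017, 3, 13), "tss": 28.0},
    {"station": "D", "date": date(2017, 4, 2), "tss": 15.0},
    {"station": "E", "date": date(2017, 4, 7), "tss": 40.0},
]
print([p["station"] for p in match_pairs(samples, scenes, rain)])
```

The visual inspection for cloud shadow, ships, and barges is a manual step and is not represented here.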

Sentinel-2 regression development

The final dataset included eight Sentinel-2 images, 32 TSS samples for calibration, and 12 for validation (see Fig. 1). Figure 3 provides a schematic of data analysis and regression development for the satellite TSS modeling. After isolating water pixels, Rrs values for the visible and NIR bands (Bands 2–4 and 8) were extracted from within a 20-m buffer around each WQM station. Image processing and extraction of Rrs values were completed using the ArcGIS ModelBuilder and the ArcPy extension. Summary statistics for each pixel window included water pixel count, mean Rrs, median Rrs, minimum and maximum Rrs, and standard deviation. Pixels lying along the boundary of the buffer were included if at least half of the pixel’s area was within the 20-m buffer.
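The study performed this extraction in ArcGIS. As a rough stand-in, the sketch below reproduces the same idea with NumPy: it estimates the share of each pixel's area inside the buffer by sub-sampling, keeps water pixels with at least half their area inside, and summarizes them. The grid layout (element [0, 0] at the lower-left corner, 10-m pixels), the function names, and the sample array are assumptions made for illustration.

```python
import numpy as np


def fraction_in_buffer(px, py, size, cx, cy, radius, n=10):
    """Approximate the fraction of a square pixel's area inside a circular buffer.
    (px, py) is the pixel's lower-left corner; an n x n grid of sub-cell
    centres is tested against the circle."""
    offsets = (np.arange(n) + 0.5) * size / n
    xs, ys = np.meshgrid(px + offsets, py + offsets)
    inside = (xs - cx) ** 2 + (ys - cy) ** 2 <= radius ** 2
    return float(inside.mean())


def buffer_stats(rrs, water, cx, cy, size=10.0, radius=20.0, x0=0.0, y0=0.0):
    """Summary statistics of water-pixel Rrs within the buffer around (cx, cy)."""
    rrs = np.asarray(rrs, dtype=float)
    water = np.asarray(water, dtype=bool)
    values = []
    for i in range(rrs.shape[0]):
        for j in range(rrs.shape[1]):
            if not water[i, j]:
                continue
            frac = fraction_in_buffer(x0 + j * size, y0 + i * size,
                                      size, cx, cy, radius)
            if frac >= 0.5:  # half-area inclusion rule from the text
                values.append(rrs[i, j])
    if not values:
        return None
    v = np.array(values)
    return {
        "count": int(v.size),
        "mean": float(v.mean()),
        "median": float(np.median(v)),
        "min": float(v.min()),
        "max": float(v.max()),
        "std": float(v.std(ddof=1)) if v.size > 1 else 0.0,
    }


# Hypothetical 6 x 6 window of 10-m pixels with one non-water pixel.
rrs = np.arange(36).reshape(6, 6) / 1000.0
water = np.ones((6, 6), dtype=bool)
water[2, 2] = False
print(buffer_stats(rrs, water, cx=30.0, cy=30.0))
```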

Fig. 3 Workflow depicting data processing and regression development for remotely sensed surface reflectance water quality characterization

Band ratios and other band relationships were selected for consistency with the literature, favoring those that most frequently show high correlations with TSS concentration (Caballero et al. 2014; Candiani et al. 2005; D’Sa et al. 2007; Matthews 2011; Nechad et al. 2010; Zheng et al. 2015). Table 1 lists the spectral variables that were evaluated in developing the regression. The resulting matched TSS-Rrs dataset (2016–2017) was semi-randomly split into calibration (n = 32) and validation (n = 12) subsets, ensuring that both were representative over time, space, season, and concentration (Fig. 1). Lastly, 11 linear regressions were analyzed with the goal of developing a simple one- or two-band equation. The most correlated spectral variables (see Table 2) were included in the linear regression to find the optimal coefficients for

$$ WQ=\sum_{i=1}^{n} C_i X_i + C_o \tag{1} $$

Table 1 Spectral variables evaluated for correlation and inclusion in regression equations

Table 2 Correlations between TSS and spectral variables as measured by Sentinel-2

where WQ is the concentration of the water quality variable of interest (TSS in this case), Ci is the coefficient for the corresponding spectral variable, Xi is the spectral variable listed in Table 2, and Co is the regression intercept. Correlations between TSS and band ratios were lower than those for the other spectral variables, so band ratios were not included in the regression analysis. The regression was optimized within Microsoft Excel to minimize the root mean square error (RMSE) and maximize R². RMSE and the coefficient of determination were chosen to evaluate model performance in terms of error magnitude and explained variance, respectively. The median absolute error and the ratio between mean and median were also calculated to address potential non-Gaussian behavior. Mean absolute error and the slope of the regression line (with 1 being ideal) were calculated for the validation datasets. The model with the best overall performance in both calibration and validation was selected.
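To make Eq. 1 and the evaluation metrics concrete, the sketch below fits the coefficients by ordinary least squares with NumPy and computes the metrics named above. It is not the authors' Excel workflow: least squares is substituted for their optimization, the slope is assumed to be that of predicted versus observed values, and the synthetic data and coefficient values are invented for illustration.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit of WQ = sum_i C_i * X_i + C_o; returns (C, C_o)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([X, np.ones(len(y))])  # last column carries the intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], float(coef[-1])


def predict(X, C, C_o):
    """Apply Eq. 1 to a matrix of spectral variables (one column per variable)."""
    return np.asarray(X, dtype=float) @ np.asarray(C, dtype=float) + C_o


def metrics(obs, pred):
    """Error and agreement metrics for observed versus predicted concentrations."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = pred - obs
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((obs - obs.mean()) ** 2))
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "r2": 1.0 - ss_res / ss_tot,
        "mae": float(np.mean(np.abs(err))),
        "median_ae": float(np.median(np.abs(err))),
        "bias": float(np.mean(err)),
        # Assumed definition: slope of predicted regressed on observed.
        "slope": float(np.polyfit(obs, pred, 1)[0]),
    }


# Synthetic calibration set (n = 32) with made-up coefficients and noise.
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(0.01, 0.10, 32),    # stand-in for red Rrs
                     rng.uniform(0.005, 0.06, 32)])  # stand-in for NIR Rrs
y = 400.0 * X[:, 0] + 300.0 * X[:, 1] + 2.0 + rng.normal(0.0, 2.0, 32)
C, C_o = fit_linear(X, y)
print(C, C_o, metrics(y, predict(X, C, C_o)))
```

Running the same `metrics` function on a held-out validation set gives the mean absolute error and slope used to compare candidate models.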

In addition to the new algorithm investigated, previously developed algorithms were applied to the spectral dataset, including the red/green band ratio in D’Sa et al. (2007), the NIR band in Caballero et al. (2014) and Zheng et al. (2015), and three algorithms using the red, green, and NIR bands from Nechad et al. (2010) and Pahlevan et al. (2017). The aforementioned evaluation metrics were used to compare the performance of these algorithms with the newly developed ones in this study.

Sentinel-2 regression model application

Out of the developed models based on Eq. 1, the one with the best performance in evaluation metrics was used to model TSS concentrations for four critical dates that had images captured by Sentinel-2:

  • 8/22/17: Baseline condition, three days before storm arrival

  • 9/1/17: Immediate effects, one day after storm passing

  • 9/11/17: Residual effects, 11 days after storm passing

  • 10/1/17: Extent of recovery, 31 days after storm passing

The application of the regression equations to the four dates allowed for a relatively comprehensive picture of Harvey’s impacts, from a pre-storm baseline to post-event recoveries. As before, the regression equation was only applied to pixels identified as containing water according to the Sen2Cor mask unique to each scene. Figure 4 shows the location of select flow gages in the GB system and displays the storm hydrograph or water surface elevation for tidally influenced gages. The points illustrate the hydrologic condition in the vicinity of the gages at the time of Sentinel-2 image capture for dates the regression was applied. As can be seen in Fig. 4, flows during the pre-storm scene were at near zero or baseflow levels, representing an ideal reference condition. For the image captured on 9/1/17, flows were either at their peak level or on the falling limb of the hydrograph, indicating that the scene is indicative of flooding conditions. By the third date, flows had returned to pre-storm levels in most locations, and did not rise again before the last date.
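Applying the selected regression to a scene reduces to evaluating Eq. 1 on every water pixel and leaving the remaining pixels empty. The sketch below assumes a two-band (red and NIR) linear form and uses placeholder coefficients; the actual variables and coefficients are those reported in Table 3, and the arrays here are hypothetical.

```python
import numpy as np


def tss_map(red, nir, water_mask, c_red, c_nir, c_o):
    """Evaluate a two-band linear TSS model on water pixels; other pixels become NaN."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    tss = c_red * red + c_nir * nir + c_o
    return np.where(np.asarray(water_mask, dtype=bool), tss, np.nan)


# Hypothetical 2 x 2 scene and placeholder coefficients (not from Table 3).
red = [[0.02, 0.05], [0.10, 0.03]]
nir = [[0.01, 0.02], [0.05, 0.01]]
mask = [[True, True], [False, True]]
tss = tss_map(red, nir, mask, c_red=500.0, c_nir=200.0, c_o=1.0)
print(tss, np.nanmedian(tss))
```

Using the scene-specific mask for each date keeps flooded land or cloud pixels out of the per-scene summary statistics.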

Fig. 4 Hydrologic context for Hurricane Harvey. A hydrograph or water surface elevation (WSE) plot is shown for each gage in the study area (triangles). The points on each plot mark the timing of the Sentinel-2 satellite passes (8/22/17, 9/01/17, 9/11/17, and 10/01/17)

Results and discussion

Figure 5 shows a sample image processing flow for a typical Sentinel-2 scene, along with a sample of the 20-m buffer applied to the pixels surrounding each station of interest. Table 2 provides the correlations between each spectral variable and TSS concentration. Single-band Rrs and the square of Rrs produced the highest correlations, with the red and NIR bands and their squares ranking highest. This is consistent with existing literature, which attributes the pattern to increased scattering from particulate matter in the red and NIR (Caballero et al. 2014; Matthews 2011; Tyler et al. 2016). Though correlations with the blue band were high, less variability was observed in that band.
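The correlation screening summarized in Table 2 can be illustrated with a short routine that ranks each band and its square by the strength of its Pearson correlation with TSS. This is a sketch under assumptions: only single bands and their squares are screened (the full variable list is in Table 1), and the function name and synthetic data are hypothetical.

```python
import numpy as np


def screen_variables(bands, tss):
    """Pearson r between TSS and each band and its square, sorted by |r| descending.
    bands maps a band name to a 1-D array of mean Rrs values."""
    tss = np.asarray(tss, dtype=float)
    out = {}
    for name, values in bands.items():
        v = np.asarray(values, dtype=float)
        out[name] = float(np.corrcoef(v, tss)[0, 1])
        out[name + "^2"] = float(np.corrcoef(v ** 2, tss)[0, 1])
    return dict(sorted(out.items(), key=lambda kv: abs(kv[1]), reverse=True))


# Synthetic match-up data for illustration only.
rng = np.random.default_rng(1)
red = rng.uniform(0.01, 0.10, 30)
nir = rng.uniform(0.005, 0.06, 30)
tss = 300.0 * red + 200.0 * nir + rng.normal(0.0, 2.0, 30)
for name, r in screen_variables({"red": red, "nir": nir}, tss).items():
    print(f"{name}: r = {r:.2f}")
```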

Fig. 5 Image processing workflow for a true-color Sentinel-2 scene captured on February 2nd, 2016 with (a) TOA reflectance, (b) BOA reflectance from the Sen2Cor processor, (c) Sen2Cor-produced water mask, (d) WQM stations overlaid on water pixels, and (e) a sample station 20-m buffer for mean pixel extraction

Table 3 describes the resulting variables and coefficients produced for several TSS regression models developed based on Eq. 1. As noted, the blue band did not perform as well during the linear regression, and the resulting regression equation utilized the red and NIR bands. The red and NIR coefficients had similar magnitudes, while the square of the NIR band was somewhat less influential.

Table 3 TSS-Sentinel-2 regression model equation coefficients and variables

Table 4 provides the performance metrics for each model. Model 1 was selected as the final model because it met the goal of a simple one- or two-band model and performed better on the validation dataset. Although model 2 has slightly lower mean and median absolute errors, model 1 has a slope much closer to one. The negative bias in model 1 indicates a tendency towards underprediction, while the positive bias in model 2 indicates overprediction. The mean-to-median ratio for all models was close to one. In selecting model 1, closeness to the one-to-one line was prioritized over the error metrics because of the importance of avoiding over- or underestimation of the large TSS concentrations that occur during extreme events.

Table 4 TSS-Sentinel-2 regression model equation performance

All of the previously developed models showed high correlations but large errors for the other evaluation metrics, except D’Sa et al. (2007) and the green band in Nechad et al. (2010). The green band (Nechad et al. 2010) showed values comparable to model 1; however, the slope of the regression line was substantially lower for the validation dataset. Additionally, model 1 performed better than models reported in similar studies, with an RMSE of 2.22 mg/L compared to RMSE values ranging from 6 to 106 mg/L (Nechad et al. (2010) reported 6–7 mg/L, Zheng et al. (2015) 6–12 mg/L, and Caballero et al. (2014) 54–106 mg/L). To compare the existing and developed models, the mean absolute percentage error (MAPE) was also calculated. Model 1 has a MAPE of 27.65% for calibration and 47.23% for validation, while similar studies reported values of 11–88% (Caballero et al. 2014; Nechad et al. 2010; Zheng et al. 2015). Figure 6 shows performance plots for the calibration and validation datasets for the final model selected based on the validation metrics. The RMSE of the selected model was 2.22 mg/L for the calibration dataset, and the correlation was 0.85. The mean and median absolute errors for the validation dataset were 9.69 and 9.18 mg/L, respectively, while the slope was 0.96.
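For reference, MAPE expresses each error relative to the observed concentration, which is why it can look large for a dataset dominated by low TSS values even when absolute errors are small. A minimal sketch using the standard definition follows; the function name and sample values are hypothetical.

```python
import numpy as np


def mape(obs, pred):
    """Mean absolute percentage error (%); observed values must be non-zero."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    if np.any(obs == 0):
        raise ValueError("MAPE is undefined when an observed value is zero")
    return float(100.0 * np.mean(np.abs(pred - obs) / np.abs(obs)))


# Hypothetical observed and predicted TSS (mg/L).
print(round(mape([10.0, 20.0, 40.0], [12.0, 18.0, 30.0]), 2))
```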

Fig. 6 Plots and performance of the selected model (model 1 from Table 4) for the calibration (top) and validation (bottom) datasets

Figure 7 shows the results of the Sentinel-2 regression modeling for the four scenes of interest for extreme weather impacts. The median TSS concentration for each day was 7, 43, 21, and 10 mg/L, respectively. Before the storm, TSS concentrations were low across the bay, with little to no rainfall in the previous two weeks. Concentrations over 125 mg/L can be seen in the lower portion of the bay one day after Harvey (09/01/17). Almost no parts of GB were below 25 mg/L, and most had at least 50 mg/L. These values are well within the range that would have negative impacts on aquatic life, and the relatively high TSS concentrations in the lower parts of the bay are well above the upper tolerance of many aquatic species (> 80 mg/L) (Griffiths and Walton 1978; Jha and Swietlik 2003). The largest concentrations appeared to follow the outflow path from the SJR, which had the highest discharge of the bay inputs. Figure 7 shows that a large sediment plume was being transported to Galveston Bay. The high TSS concentration, when combined with high storm flows (Fig. 4), suggests that immediately after the storm (9/1/2017), most of the sediment load would have been exported directly to the Gulf of Mexico. In other words, sediment transport shortly after the storm was dominated by advection over deposition due to high flow rates discharging into the GB estuary.
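Statements such as "almost no parts of GB were below 25 mg/L" can be quantified from a modeled TSS map by computing the share of water pixels above each concentration of interest. The sketch below uses the 25, 50, and 80 mg/L levels mentioned in the text; the function name and the sample array are hypothetical, and NaN is assumed to mark non-water pixels.

```python
import numpy as np


def fraction_above(tss, thresholds=(25.0, 50.0, 80.0)):
    """Share of valid (non-NaN) pixels whose modeled TSS exceeds each threshold (mg/L)."""
    tss = np.asarray(tss, dtype=float)
    valid = tss[~np.isnan(tss)]
    if valid.size == 0:
        return {t: float("nan") for t in thresholds}
    return {t: float(np.mean(valid > t)) for t in thresholds}


# Hypothetical modeled TSS map (mg/L); NaN marks a non-water pixel.
tss = np.array([[10.0, 30.0], [60.0, 130.0], [np.nan, 90.0]])
print(fraction_above(tss))
```

Comparing these fractions across the four dates gives a compact summary of how much of the bay surface exceeded the tolerance levels cited above.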

Fig. 7 Study area maps depicting predicted TSS concentration at each Sentinel-2 pass date: (a) 08/22/17 pre-Harvey, (b) 09/01/17 one day after Harvey, (c) 09/11/17 11 days after Harvey, and (d) 10/01/17 one month after Harvey

Ten days later (on 09/11/17), TSS concentrations had decreased but were still above 25 mg/L for much of the bay surface. Notably, flow rates had returned to pre-storm levels (Fig. 4), and the export of the sediment plume to the Gulf of Mexico had ceased. Much of the remaining sediment mass can be expected to have been deposited on the bay floor (Du et al. 2019) due to lower water velocities; once discharge slowed between 1 and 11 days after the storm (Fig. 4), deposition likely surpassed advection as the dominant mechanism. TSS concentrations remained in a range that stresses aquatic biota across much of the bay’s surface, suggesting continued aquatic life impacts even 11 days after the storm. Heavy metal and organic pollutant loads associated with stormwater runoff are likely to have been transported directly to the Gulf of Mexico (Kiaghadi and Rifai 2019), with little deposition in Galveston Bay during the initial days after the storm. Finally, in the last image on 10/01/17, 31 days after the storm’s passing, TSS concentrations had returned to near-normal levels, similar to those observed before the storm.

Conclusions

The flooding, loss of life, and economic impacts of extreme weather events such as Harvey typically receive significant amounts of attention. The environmental impacts of these events typically receive less publicity because of the paucity of data. Satellite imagery offers an essential and cost-effective resource (once the cloud cover clears) that enables quantification of water quality impacts in near real-time without the need for extensive field campaigns during potentially unsafe post-storm conditions. The methodology and approach presented here contribute to a better understanding of solids transport during hurricanes and severe storms and identify scope for potential follow-up water and sediment quality studies that would need to be undertaken to assess recovery of systems such as Galveston Bay after Harvey.

It should be noted that the main purpose of this study was not to develop a universal remote sensing algorithm for TSS. Rather, it was to demonstrate how remote sensing tools can be used to quantify water quality and ecosystem impacts in the aftermath of an extreme climatic event. While the developed algorithm might not be applicable to other waterbodies, the methodology used in this work could be easily applied to any area where both spectral and water quality data are available. Remote sensing can greatly reduce the cost, time, and safety risks for data acquisition in comparison to conventional measurement techniques. This is of particular importance during extreme climatic events. Water quality records often miss peak events, especially for TSS, due to safety risks associated with elevated flood flows and levels. Additionally, studies of sediment transport and deposition that rely on point-based field data collection can be augmented with the two-dimensional surface concentration gradient that satellite imagery provides. The research demonstrates the importance of remotely sensed images for characterizing water quality during floods and hurricanes and filling knowledge gaps in water system response and sediment fate and transport.