Introduction

Suspended sediments in shallow lakes can impact the physical and chemical environment of the water column through resuspension and transportation and can also alter the light quality, subsequently affecting the growth of phytoplankton (Jin and Ji 2004). Furthermore, pollutants and heavy metals of terrestrial origin are accumulated and transported by sediments in aquatic environments, with impacts on the health of both aquatic wildlife and humans (Feng et al. 2012). Currently, a number of large lakes around the world are suffering from serious environmental problems related to contaminated sediment associated with climate change and human activities.

The following three methods are primarily used to study lake sediment dynamics: ship-based surveys, numerical modeling, and satellite-based remote sensing. Traditional ship-based surveys collect suspended sediment data directly; however, the sparse spatial and temporal density of these data cannot fully represent spatial and temporal information (Puls et al. 1994). Numerical modeling, which is widely applied to study sediment dynamics in inland lake waters (Chao et al. 2008; Lee et al. 2005; Teeter et al. 2001; Wang et al. 2013), benefits from recent advances in computational fluid dynamics and predicts suspended sediment patterns at detailed spatial and temporal resolution based on formulas that define and calculate the physical process of sediment transport (Stroud et al. 2009). Satellite-based remote sensing observations generate a synoptic picture of sediment concentrations throughout a region of interest, and this approach is commonly used to study long-term and short-term sediment processes in lakes (Feng et al. 2012; Schiebe et al. 1992). Satellite observations are also a valuable source of data for numerical model evaluations and provide high-resolution data over multiple spatial and temporal scales. Therefore, a number of studies have combined numerical modeling and remotely sensed data, contributing to scientific research on suspended sediment dynamics in ocean and inland water environments (Chen et al. 2010; Kouts et al. 2007; Miller and McKee 2004; Pleskachevsky et al. 2005).

However, both numerical modeling and satellite observations involve uncertainties. In numerical modeling, several types of errors are inherent in predictions of sediment transport, including errors in the governing equations of the numerical model, which imperfectly describes the complex physical processes involved in sediment dynamics, and errors when simplifying equations during numerical calculations (Gregg 2008). In addition, current models are far from perfect and are subject to uncertainty with regard to model parameters and model input data, including bathymetry and initial and boundary conditions (Natvik 2003). Uncertainty in satellite observations may occur because of onboard errors in satellite sensors and the effects of atmospheric conditions on signal transmission. A high level of uncertainty also stems from the interpretation methods or inversion models by which suspended sediment concentrations are indirectly retrieved from remotely sensed signals (Huang et al. 2008).

Fortunately, data assimilation provides a useful tool for reducing such uncertainties and improving numerical model results. Through data assimilation, numerical models are integrated with measurements in a manner that respects the system’s dynamics and acknowledges measurement errors. Indeed, this approach prevents a model from deviating too far from reality, thus achieving more accurate predictions. Data assimilation techniques were pioneered by meteorologists in response to their need for accurate weather predictions (Daley 1991) and are now widely used in complex models of ocean dynamics and land surface and hydrological processes (Carton et al. 2000; Clark et al. 2008; Dumedah and Walker 2014; Larsen et al. 2007). Although a number of studies on suspended sediment assimilation have been reported in recent years (Margvelashvili et al. 2013; Smith et al. 2011; Zhang et al. 2014), these studies have focused on sediment dynamics in oceanic and coastal waters and not in inland lakes.

The purpose of this study is to report the findings from our assimilation of remotely sensed sediment data into a sediment transport model. Sediment transport in China’s largest freshwater lake, Poyang Lake, is simulated using the explored assimilation scheme. Understanding and predicting the movement of suspended sediment in Poyang Lake is important because many of the contaminants of concern in the lake’s water are associated with sediment particles, and previous studies have indicated that sediment input from surrounding rivers is a major source of pollutants in the lake (Xiang and Zhou 2011). Pollutants are absorbed into lake water during sediment transport, deposition, and resuspension, causing harmful effects on local ecosystems (Luo et al. 2008; Yuan et al. 2011).

To date, limited research has been devoted to sediment transport modeling of Poyang Lake. Zhang et al. (2015a) built and evaluated a sediment transport model of the lake, and several other studies based on remotely sensed data have contributed to our understanding of the suspended sediment dynamics of this lake. However, most of these previous studies have investigated the development of sediment retrieval algorithms from satellite images (Feng et al. 2012; Wu et al. 2013; Yu et al. 2012), and none have incorporated historical satellite data into a numerical model to examine the performance of the model as the system evolves over time and space. Therefore, we combine numerical model simulations with remote sensing observations of Poyang Lake using a data assimilation method in the present study.

The remainder of this paper is organized as follows. The “Study area and available data” section introduces the case study region and available datasets. The “Methods” section describes the satellite data processing method, the sediment transport model of the lake, and the data assimilation scheme. The “Results and discussion” section presents the hydrodynamic model validation results and discusses the data assimilation results, including the selection of parameters for optimal interpolation (OI), and the spatial and temporal effects of the assimilation on the model. The final section, “Conclusions,” presents our conclusions.

Study area and available data

Study area

Poyang Lake (28.37° to 29.75° N, 115.78° to 116.75° E) is located in northern Jiangxi Province at the junction of the south bank of the Yangtze River (Fig. 1) and is the largest freshwater lake in China. The elevation of the lake bed generally decreases from south to north by a total of approximately 20 m. The lake has an average water depth of 8.4 m and a storage capacity of 27.6 billion m³ when the water level is 21.7 m. Geographically, the lake is divided into the following two parts: the southern lake, which is large and shallow, and the northern lake, which is narrow and deep. The lake receives water containing a large amount of suspended matter from five rivers (Ganjiang, Fuhe, Xinjiang, Raohe, and Xiushui) and flows into the Yangtze River at Hukou. One of the most important wetlands in the world, which is recognized by the International Union for the Conservation of Nature, is located at Poyang Lake, and the submerged aquatic plants and their rich diversity provide a habitat for hundreds of thousands of species. The lake hosts a diverse array of local ecosystems, wildlife habitats, and human-centered socio-economic activities. However, reports indicate that the water quality of Poyang Lake has declined in recent years and that this decline has led to numerous environmental problems. For example, a rapid decrease in the fish population has been attributed to decreased water quality and poor environmental conditions (Wu et al. 2007).

Fig. 1 Study area and measurement sites

Available data

Data used in this study include the following: water level, water discharge, and sediment flux data obtained from hydrological stations; wind field datasets from meteorological stations; in situ measured water reflectance, sediment concentration and current velocity data; and charge-coupled device (CCD) images from the Huanjing (HJ) satellite.

The water level, water discharge, and sediment flux data were obtained from hydrological stations at Poyang Lake from January 1 to September 30, 2011. Daily water level data were collected at three stations: Xingzi, Duchang, and Kangshan (Fig. 1). Each of the five tributaries in the Poyang Lake basin has a hydrological station (Fig. 1) that gauges stream discharges into the lake. Of these hydrological stations, Wanjiabu measures the discharge of Xiushui, Waizhou measures the discharge of Ganjiang, Lijiadu measures the discharge of Fuhe, Meigang measures the discharge of Xinjiang, and Hushan and Dufengkeng measure the discharge of Raohe. Water discharge and sediment flux data were collected daily from these hydrological stations, and water level and water flux data were collected from the Hukou station, which is located at the junction of Poyang Lake and Yangtze River (Fig. 1).

Daily wind field data from the meteorological station closest to Poyang Lake (Boyang station, see Fig. 1) were collected from the China Meteorological Data Sharing Service System (http://cdc.cma.gov.cn/).

A cruise survey was conducted from July 15 to July 24, 2011 using a small fishing boat, and surface water samples were collected from 47 in situ sites. For each water sample, 500 mL was collected and immediately filtered through a pre-weighed Whatman Cellulose Acetate Membrane filter with a diameter of 47 mm and a nominal pore size of 0.45 μm. The filter was stored in a desiccator and then combusted for 3 h in a 500 °C oven. Later, the filter was removed from the desiccator and weighed in the laboratory. An analytical balance with a precision of 0.01 mg was used to weigh the filter, and the sediment concentration was determined by normalizing the weight difference with the filtered water volume. The radiances of water, sky, and a reference plaque were measured at the sediment measurement sites using an ASD FieldSpec Dual spectrometer, and the reflectance at the water surface was calculated using a previously described approach (Mobley 1999). During the survey, the current velocities were measured using isochronous shipboard acoustic Doppler current profilers (ADCPs) on two tracks (Fig. 1). The first track was located near the hydrological station at Duchang, and the current velocities on this track were measured on July 19, 2011. The second track was located in the northern part of the lake (the waterway), and the current velocities were measured on July 20, 2011. The ADCPs acquired the vertical distribution of the current velocity on the two tracks. Six levels of the current velocities in the vertical distribution at each ADCP sample point were selected, and depth-averaged velocities were calculated. These in situ depth-averaged current velocities are used to validate the hydrodynamic model in this study.
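The two in situ quantities described above reduce to simple arithmetic, sketched here in Python. The function names are hypothetical, and the equal weighting of the six velocity levels is an assumption, because the text does not state how the levels were weighted.

```python
def gravimetric_ssc(filter_mass_before_mg, filter_mass_after_mg, volume_ml):
    """Sediment concentration (mg/L) from the filter weight difference.

    The weight difference is normalized by the filtered water volume,
    converted here from mL to L.
    """
    if volume_ml <= 0:
        raise ValueError("filtered volume must be positive")
    retained_mass_mg = filter_mass_after_mg - filter_mass_before_mg
    return retained_mass_mg / (volume_ml / 1000.0)


def depth_averaged_velocity(level_velocities_m_s):
    """Depth-averaged velocity from the selected vertical levels.

    Assumption: the levels are equally weighted (evenly spaced in depth).
    """
    if not level_velocities_m_s:
        raise ValueError("at least one velocity level is required")
    return sum(level_velocities_m_s) / len(level_velocities_m_s)
```

For a 500 mL sample, a retained mass of 25 mg corresponds to 50 mg/L under this calculation.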

Cloud-free HJ-1A/1B CCD images were obtained on July 5, July 21, July 23, and July 25, 2011. The HJ-1A and HJ-1B satellites were launched by the China Center for Resources Satellite Data and Application (CRESDA) on September 6, 2008 (http://www.cresda.com/n16/n92006/index.html), and they have a sun-synchronous circular orbit with a frequent revisit time of 2 days. The multispectral CCD sensors on board the two satellites have three visible bands (430–520, 520–600, and 630–690 nm) and one near-infrared band (760–900 nm) with a spatial resolution of 30 m; such parameters are considered important features for environmental monitoring.

Methods

Remote sensing sediment inversion

Suspended sediments are highly reflective and easily detected in visible wavelength satellite images. Accordingly, the application of satellite data to studies of sediment transport in marine and aquatic systems is an active area of research. To utilize satellite image data to derive sea surface sediment concentrations, we must determine a relationship between the suspended sediment concentrations (SSC) and water reflectance. Such relationships have been proposed through semi-analytical algorithms based on radiative transfer theory (Dekker et al. 2002; Volpe et al. 2011) and in the context of empirical regression methods. A number of empirical regression relationships have been tested to establish the remote sensing retrieval models for suspended sediments, including linear, exponential, and logarithmic statistical relationships (Doxaran et al. 2002; Han et al. 2006; Miller and McKee 2004). In this study, we attempt to establish an empirical regression algorithm to retrieve the sediment concentrations from HJ-CCD images.

Based on the in situ water reflectance and SSCs at all measurement sites, the optimal relationship for representing the model (with a squared correlation coefficient of 0.94) is identified as follows:

$$ SSC= \exp \left[0.06785+89.677\times \frac{Rss(660)+Rss(830)}{Rss(560)}\right] $$
(1)

where SSC denotes the suspended sediment concentration (mg/L) and Rss(660), Rss(830), and Rss(560) denote the water reflectance at 660, 830, and 560 nm, respectively. Figure 2 shows a scatterplot of in situ water reflectance and SSC.

Fig. 2 Paired in situ suspended sediment concentration and water reflectance

The four available HJ-CCD remote sensing images from July 2011 are processed using ENVI 4.5 software, and radiometric calibrations are performed using coefficients provided with the image (e.g., gains and offsets). The FLAASH module in ENVI is applied to correct for the atmosphere based on the location, sensor type, and ground weather conditions observed on the day the image was acquired (Berk et al. 2002) and to obtain the remotely sensed water reflectance at the water surface. The surface sediment concentration was then retrieved from the water reflectance images based on this sediment retrieval model (Eq. (1)).
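As a minimal sketch, the retrieval in Eq. (1) can be applied to one pixel as follows. The function name `retrieve_ssc` and the handling of non-positive green-band reflectance are assumptions made for illustration; the coefficients are those given in Eq. (1).

```python
import math


def retrieve_ssc(rss_560, rss_660, rss_830):
    """Suspended sediment concentration (mg/L) from Eq. (1).

    rss_560, rss_660, rss_830: water reflectance at 560, 660, and 830 nm.
    Returns NaN when the 560 nm reflectance is not positive (assumed
    handling for masked or invalid pixels).
    """
    if rss_560 <= 0:
        return float("nan")
    band_ratio = (rss_660 + rss_830) / rss_560
    return math.exp(0.06785 + 89.677 * band_ratio)
```

Because the band ratio sits inside an exponential, small errors in the atmospherically corrected reflectance are amplified in the retrieved concentration. This is one reason the observation error variance is estimated explicitly before assimilation.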

Sediment transport model description

The Delft3D-FLOW numerical modeling system is used to set up the hydrodynamic and sediment transport model for Poyang Lake. This system has been developed for modeling unsteady water flow, cohesive/non-cohesive sediment transport in shallow seas, estuarine and coastal areas, and rivers and lakes (Borsje et al. 2008; Lesser et al. 2004). The Delft3D-FLOW module performs hydrodynamic calculations by solving continuity and horizontal momentum equations for given initial and boundary conditions in two or three dimensions using an implicit finite difference process (Alternating Direction Implicit (ADI) method) on a staggered (spherical or orthogonal curvilinear) grid. Sediment transport is simultaneously calculated based on the advection–diffusion equation (Lesser et al. 2004).

The two-dimensional advection–dispersion equation for the fine sediment transport model in Delft3D-FLOW is calculated as follows:

$$ h\frac{\partial hC}{\partial t}+h\frac{\partial uC}{\partial x}+h\frac{\partial vC}{\partial y}=\frac{\partial }{\partial x}\left(h{A}_H\frac{\partial C}{\partial x}\right)+\frac{\partial }{\partial y}\left(h{A}_H\frac{\partial C}{\partial y}\right)+{\displaystyle \sum_{i=1}^n{S}_i} $$
(2)

where C denotes the sediment concentration, t denotes time, h denotes the water depth, u and v denote the two components of the velocity vector, A_H denotes the eddy diffusivity coefficient, and S_i denotes the source/sink term that describes the vertical flux between the bed and water column. These fluxes are the result of erosion and deposition, which are calculated in Eqs. (4) and (3), respectively.

The cohesive sediment deposition rate is calculated using Krone’s widely used deposition formula (Krone 1962). The bed erosion rate of cohesive sediment is determined by the classic formula given by Partheniades (Partheniades 1965). The formulas are as follows:

$$ {R}_D=\begin{cases}{W}_s{C}_b\left(1-{\tau}_b/{\tau}_{cd}\right) & \text{for }{\tau}_b<{\tau}_{cd}\\ 0 & \text{for }{\tau}_b\ge {\tau}_{cd}\end{cases} $$
(3)
$$ {R}_E=\begin{cases}{E}_b\left({\tau}_b/{\tau}_{ce}-1\right) & \text{for }{\tau}_b>{\tau}_{ce}\\ 0 & \text{for }{\tau}_b\le {\tau}_{ce}\end{cases} $$
(4)

where R_D and R_E denote the deposition and erosion rates, respectively, C_b denotes the near-bottom layer concentration, E_b denotes the erosion constant, W_s denotes the settling velocity, τ_cd denotes the critical shear stress of deposition, τ_ce denotes the critical shear stress of erosion, and τ_b denotes the bed shear stress. The x and y components of the bed shear stresses are calculated as follows:

$$ \left({\tau}_{bx},{\tau}_{by}\right)={C}_d\sqrt{u_b^2+{v}_b^2}\left({u}_b,{v}_b\right) $$
(5)

where u_b and v_b denote the x and y components of the near-bottom velocity and C_d denotes the drag coefficient, which is determined by matching a logarithmic bottom layer to the model at height z above the bottom.
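The source/sink terms in Eqs. (3) to (5) can be sketched as three small functions. This is an illustrative transcription of the formulas as written above, not the Delft3D-FLOW implementation; the function and argument names are hypothetical, and units must be chosen consistently by the caller.

```python
import math


def deposition_rate(w_s, c_b, tau_b, tau_cd):
    """Eq. (3): deposition occurs only below the critical shear stress."""
    if tau_b < tau_cd:
        return w_s * c_b * (1.0 - tau_b / tau_cd)
    return 0.0


def erosion_rate(e_b, tau_b, tau_ce):
    """Eq. (4): erosion occurs only above the critical shear stress."""
    if tau_b > tau_ce:
        return e_b * (tau_b / tau_ce - 1.0)
    return 0.0


def bed_shear_stress(c_d, u_b, v_b):
    """Eq. (5): x and y components of the bed shear stress, as written."""
    speed = math.hypot(u_b, v_b)  # magnitude of the near-bottom velocity
    return c_d * speed * u_b, c_d * speed * v_b
```

The two thresholds make the net bed flux piecewise: when τ_cd is smaller than τ_ce, any bed shear stress between them produces neither deposition nor erosion.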

Model development

To simulate the long-term current velocity and inundation area dynamics, a hydrodynamic model was established by Zhang et al. (2015b) based on the Delft3D-FLOW model system. The median size distribution of suspended sediments from field survey data for Poyang Lake during the wet season has been reported as fine (mostly less than 74.48 μm) (Zhang 2012). Therefore, a fine sediment transport model was coupled with the established hydrodynamic model and calibrated and validated with in situ sediment (Zhang et al. 2015a). Orthogonal curvilinear model grids were used in the hydrodynamic and sediment transport models, with the grid size varying from 200 to 300 m. The bathymetry at each computational node of the model grids was interpolated from the bathymetry data measured by the Changjiang Water Resources Commission of China (http://eng.cjw.gov.cn/).

In this study, the model run time is extended based on the model by Zhang et al. (2015a), and hydrodynamic and sediment transport from January 1, 2011 to September 30, 2011 is simulated. To meet the Courant–Friedrichs–Lewy (CFL) criterion for a stable solution (Hydraulics 2006), the model time step is set as 30 s. Daily wind data from Boyang meteorological station are used for the spatially uniform water surface driving force. The river flow rates and sediment concentrations measured from the hydrological stations of the five tributaries are prescribed in the river inlets as the upper inflow boundary condition (Fig. 1). The lower open boundary is set at the junction between the lake and Yangtze River at Hukou, and the daily water levels measured at Hukou station are prescribed at the grid points along the open boundary. The Neumann boundary condition is used for the sediment concentration at the open boundary. The model is initialized to a current velocity of zero, and the water level is initialized to the mean water level of the four hydrological stations (Xingzi, Duchang, Tangying, and Longkou) in Poyang Lake on January 1, 2011. The sediment concentration is initialized to the mean field measured sediment data in July 2011. The model is run with the calibrated parameters in Zhang et al. (2015a) until the desired month is reached, at which point the remotely sensed sediment data are assimilated into the model.
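To give a sense of scale for the time step, the sketch below evaluates a one-dimensional gravity-wave Courant number. It is an illustrative diagnostic only: the function name is hypothetical, and the exact stability expression used with the ADI scheme is not given in the text and may differ.

```python
import math


def wave_courant_number(dt_s, depth_m, dx_m, g=9.81):
    """One-dimensional gravity-wave Courant number dt * sqrt(g * h) / dx.

    Assumption: a simplified 1D form used as a rough diagnostic; it is not
    the criterion documented for the modeling system.
    """
    if dx_m <= 0 or depth_m < 0:
        raise ValueError("grid size must be positive and depth non-negative")
    return dt_s * math.sqrt(g * depth_m) / dx_m


# Values taken from the text: 30 s time step, 8.4 m mean depth, 200 m grid.
courant = wave_courant_number(30.0, 8.4, 200.0)
```

With these values the number is slightly above 1. Implicit schemes such as ADI generally tolerate wave Courant numbers above 1, so this value is not by itself an indication of instability.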

Assimilation scheme

A widely used OI algorithm for assimilating ocean data is employed in this study (Carton et al. 2000; Fox et al. 2002). The OI method relies on model forecasting and observations using a least squares estimator to determine the statistically optimal state of the ocean. This method is simple to implement and has a relatively small computational cost, especially for extremely non-linear high-dimensional ocean model systems. Sediment observations from remote sensing images are initially interpolated into a model grid using OI and a model-forecast field. The number of model grids is assumed to be n, and the number of pixels representing the remotely sensed sediment data is assumed to be m. OI is a simplified version of the Kalman Filter, and the scheme of assimilating remote sensing observations using OI can be represented by the following equation:

$$ {C}_k^a={C}_k^f+{W}_k\left({C}_k^{rs}-H{C}_k^f\right) $$
(6)

where C denotes the vector of sediment concentration, with superscripts a, f, and rs denoting the analysis, model forecast, and remote sensing observations, respectively (C^a and C^f are n-dimensional, and C^rs is m-dimensional); subscript k denotes the assimilation time when remotely sensed sediment data are available; the observation operator H is an m × n matrix that maps the data from the model grids into remote sensing measurement space; and W is an n × m matrix of weights, which is generally called the Kalman gain. To minimize the error variance of C_k^a, which is generally called the “analysis” or “analyzed state” in data assimilation, W_k may be calculated as follows (Daley 1991):

$$ {W}_k={P}^f{H}^{\mathrm{T}}{\left(H{P}^f{H}^{\mathrm{T}}+R\right)}^{-1} $$
(7)

where P^f is the n × n error covariance matrix of the sediment concentration field from the model forecast and R is the m × m error covariance matrix of the sediment concentration field from remote sensing images. After determining W_k, which indicates the influence of each observation on the analysis, the analyzed state is then obtained by Eq. (6). The model is then integrated to the next forecast time, and the analyzed state becomes the initial condition until the next assimilation time.
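The analysis step in Eqs. (6) and (7) can be sketched with NumPy as follows. The function name `oi_analysis` is hypothetical, and the sketch assumes that H is stored with one row per observation and one column per model grid. For a lake-sized grid the dense matrices used here would be impractical, so this is a small-scale illustration rather than the scheme as implemented in the study.

```python
import numpy as np


def oi_analysis(c_f, c_rs, H, P_f, R):
    """Return the analyzed state c_a = c_f + W (c_rs - H c_f).

    c_f : forecast vector, one value per model grid
    c_rs: observation vector, one value per observation
    H   : observation operator (rows: observations, columns: model grids)
    P_f : forecast error covariance
    R   : observation error covariance
    """
    innovation = c_rs - H @ c_f
    innovation_cov = H @ P_f @ H.T + R
    # Solve the linear system instead of forming the explicit inverse in Eq. (7).
    weighted_innovation = np.linalg.solve(innovation_cov, innovation)
    return c_f + P_f @ H.T @ weighted_innovation


# Toy example: three model grids, one observation at the first grid,
# uncorrelated forecast errors, equal forecast and observation variances.
c_a = oi_analysis(
    np.array([10.0, 10.0, 10.0]),
    np.array([20.0]),
    np.array([[1.0, 0.0, 0.0]]),
    np.eye(3),
    np.array([[1.0]]),
)
```

With equal forecast and observation error variances, the analysis at the observed grid lies halfway between the forecast and the observation. The unobserved grids stay unchanged because the toy P^f has no off-diagonal correlation.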

To perform OI, the model forecast error covariance matrix P^f and the observation error covariance matrix of the remotely sensed data R in Eq. (7) must be determined. In the present study, it is assumed that the remote sensing observation errors follow a Gaussian distribution and are mutually uncorrelated. Therefore, the error covariance matrix R is a diagonal matrix, where the error variances of remote sensing observations are located on the main diagonal of the matrix and all other matrix elements are 0. The error variance of the remotely sensed sediment concentrations is obtained through a comparison with the in situ measured sediment concentrations at all in situ sites, and it is computed using the following formula:

$$ {\sigma}_{rs}^2=\frac{1}{N}{\displaystyle \sum_{i=1}^N{\left({C}_i^{rs}-{C}_i^{\mathrm{in\ situ}}\right)}^2} $$
(8)

where σ_rs^2 denotes the error variance of the remotely sensed sediment concentrations; C_i^{in situ} and C_i^{rs} denote the in situ measured and remotely sensed sediment concentration at the ith in situ site, respectively; and N denotes the number of in situ measurement sites.
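A minimal sketch of the observation error variance follows, computed as the mean squared difference between paired remotely sensed and in situ concentrations. The function name is hypothetical.

```python
def observation_error_variance(c_rs, c_in_situ):
    """Mean squared difference between remotely sensed and in situ values."""
    if len(c_rs) != len(c_in_situ) or len(c_rs) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_differences = [(rs - obs) ** 2 for rs, obs in zip(c_rs, c_in_situ)]
    return sum(squared_differences) / len(squared_differences)
```

Under the uncorrelated-error assumption, this single variance fills the main diagonal of R.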

The model forecast error covariance P^f is usually specified using an error correlation model. A number of schemes to calculate forecast error correlations have been proposed and applied in oceanic data assimilation (Høyer and She 2007; Larsen et al. 2007). In the present study, a widely used exponential correlation model is chosen to define the error correlation. The model is based on the assumption that the forecast errors follow a Gaussian distribution and that the error correlation decreases exponentially with the square of the distance (Mangiarotti et al. 2013). The formula for the correlation model is written as follows:

$$ \rho = \exp \left[-\frac{{\left(\varDelta x\right)}^2+{\left(\varDelta y\right)}^2}{L^2}\right] $$
(9)

where ρ is the forecast error correlation, Δx and Δy are the distances between two forecast grid points in the x and y directions, respectively, and L is the error correlation length, which limits the influence of interpolated data within a fixed region of the OI (Xie and Zhu 2010). To formulate the error covariance matrix P^f, the standard error variances should be determined. The standard error variances are obtained by a classic method by which model outputs from a period of simulation time are taken and the mean value and error variance for each model grid are calculated (Oke et al. 2002; Xie and Zhu 2010). In the present study, 61-day model outputs from June to July are selected to calculate the forecast error variances. The model outputs from June to July are selected because sediment transport during this period can represent the typical sediment dynamics during the wet season, and because this period covers the full assimilation period.
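The construction of the forecast error covariance can be sketched as below. The function names are hypothetical, and combining the per-grid standard deviations with the correlation as σ_i σ_j ρ_ij is an assumption, since the text does not spell out this step. It is the usual way to build a covariance from variances and a correlation model.

```python
import numpy as np


def forecast_error_std(model_outputs):
    """Per-grid standard deviation over a series of model outputs.

    model_outputs: array of shape (number of output times, number of grids).
    """
    return np.std(np.asarray(model_outputs, dtype=float), axis=0)


def forecast_error_covariance(x, y, sigma_f, corr_length):
    """Covariance with entries sigma_i * sigma_j * rho_ij, rho from Eq. (9)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma_f = np.asarray(sigma_f, dtype=float)
    dx = x[:, None] - x[None, :]  # pairwise x separations
    dy = y[:, None] - y[None, :]  # pairwise y separations
    rho = np.exp(-(dx ** 2 + dy ** 2) / corr_length ** 2)
    return np.outer(sigma_f, sigma_f) * rho
```

Two grids separated by exactly one correlation length have a correlation of exp(−1), about 0.37, so an observation still carries noticeable weight a few grid cells away when L is several times the grid size.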

Two schemes are explored for determining the observation operator H. The first scheme utilizes the remotely sensed data at the pixel nearest to a model grid as the data for the model grid. Therefore, the operator H_1 is an m × n matrix with the jth row given by H_{1,j} = (0,…,0,1,0,…,0), where the position of 1 matches the jth observation to a component of the forecast vector. Because the spatial resolution of remotely sensed data is finer than the resolution of model grids and multiple remotely sensed pixels fall within one model grid, a classic super-observations method (Oke et al. 2009; Pan et al. 2014; Sakov et al. 2012) is used as a second scheme to determine the operator H_2. This method finds the mean of the remotely sensed data within one model grid, and this mean is then used as the data for that model grid. Assuming that there are k remotely sensed pixel data in the sth model grid, the sth row in the inverse matrix (with a dimension of n × m) of H_2 is composed of k elements with a value of 1/k and m − k elements with a value of 0.
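One way to realize the two schemes is sketched below, assuming each observation or pixel has already been assigned the index of its model grid. The function and argument names are hypothetical, and the spatial matching itself is outside the sketch. The selection matrix is stored with one row per observation.

```python
import numpy as np


def nearest_pixel_operator(obs_grid_index, n_grid):
    """Selection matrix: row j holds a single 1 at the grid matched to observation j."""
    obs_grid_index = np.asarray(obs_grid_index, dtype=int)
    H = np.zeros((obs_grid_index.size, n_grid))
    H[np.arange(obs_grid_index.size), obs_grid_index] = 1.0
    return H


def super_observations(pixel_values, pixel_grid_index, n_grid):
    """Average all pixels that fall within the same model grid.

    Returns the super-observation values and the indices of the model
    grids that contain at least one pixel.
    """
    pixel_values = np.asarray(pixel_values, dtype=float)
    pixel_grid_index = np.asarray(pixel_grid_index, dtype=int)
    counts = np.bincount(pixel_grid_index, minlength=n_grid)
    sums = np.bincount(pixel_grid_index, weights=pixel_values, minlength=n_grid)
    observed = np.flatnonzero(counts)  # grids covered by at least one pixel
    return sums[observed] / counts[observed], observed
```

After averaging, the same selection matrix can map the forecast onto the observed grids, for example `nearest_pixel_operator(observed, n_grid)`. Averaging k pixels also reduces random pixel noise, which is one plausible reason for a super-observation scheme to perform better than using a single nearest pixel.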

In the present study, the hydrodynamic and sediment transport model is first run with the provided input data and parameters and then validated by in situ measurements. The model runs are then conducted by sequentially assimilating the four sediment concentration images obtained in the “Remote sensing sediment inversion” section into the model using OI schemes. To effectively assimilate the data, the optimal forecast error correlation length L and observation operator H are determined. By repeatedly assimilating remotely sensed sediment data using different correlation lengths and two different observation operators, H_1 and H_2, the set of parameters producing the best model prediction of sediment concentrations is selected. In this study, the root-mean-square error (RMSE) is calculated to evaluate the model’s performance as follows:

$$ RMSE\left(m,o\right)=\sqrt{\frac{{\displaystyle {\sum}_{i=1}^N{\left({C}_i^m-{C}_i^o\right)}^2}}{N}} $$
(10)

where C denotes the N-dimensional vector of sediment concentration, superscript m denotes the model-predicted results, superscript o denotes the measurements, and N denotes the number of measurements. The measurements are derived from remote sensing observation and in situ measurements, allowing us to evaluate the model from different perspectives.
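Equation (10) translates directly into a short function. The name `rmse` and the input validation are illustrative choices.

```python
import math


def rmse(predicted, measured):
    """Root-mean-square error between paired predictions and measurements."""
    if len(predicted) != len(measured) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, measured)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

The same function serves both evaluation perspectives mentioned above: it only requires that the model predictions be sampled at the locations of the in situ sites or the remote sensing pixels.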

Results and discussion

Hydrodynamic model validation

Figure 3 shows a comparison of the water levels from the model and hydrological stations (Xingzi, Duchang, and Kangshan) from June 1 to September 30, 2011, revealing that the model results generally match the measured water levels and could reproduce dynamic changes in the water level. The R² values are all greater than 0.99 for the simulated water levels at the three stations, with RMSE values of 0.17, 0.25, and 0.21 m. Figure 4 demonstrates that the simulated velocities are reasonably consistent with the measurements along the two ADCP tracks. For the simulated velocities along the two ADCP tracks, the RMSEs are 0.028 and 0.031 m/s, and the R² values are 0.91 and 0.89; these results indicate that the model’s performance is satisfactory. In general terms, the hydrodynamics of Poyang Lake can be accurately predicted by the investigated model.

Fig. 3 Model validation of the water level

Fig. 4 Comparison of the simulated and ADCP-measured velocity

Assimilation results

The assimilation experiments are conducted by sequentially updating the simulated sediment concentrations with data from four remote sensing sediment images through OI schemes. A series of assimilations are repeatedly performed by changing the error correlation length and using the observation operators H_1 and H_2. Because the model grid sizes varied from 200 to 300 m, trials with error correlation lengths of 250, 500, 750, …, 1750, and 3500 m are performed. The RMSE values of the predicted sediment concentration at the in situ measurement sites are calculated for each assimilation experiment, and Table 1 provides a summary of the RMSE values. The RMSE decreases when the error correlation length increases from 250 m to approximately 1000 m, indicating that an improvement in assimilation ability occurs along with this increase. The error correlation length largely controls the impact area of the observations. When an error correlation length is undersized, the observations will have little effect on the model results at the observation sites. This explains why the RMSE is large when using 250 m as the correlation length. However, the RMSE starts to increase as the error correlation length surpasses 1250 m. These results are in accordance with those of an assimilation experiment conducted by Pan et al. (2014) and confirm the theory of Hamill et al. (2001), which states that as the error correlation length increases, the error typically grows and eventually interferes with the correct covariance. This interference causes a loss of the assimilation ability when the correlation length is too large. Table 1 shows that when the error correlation length is constant, lower RMSE values are observed with H_2 compared with H_1. The minimum value of the calculated RMSE is 13.1 mg/L. Therefore, an error correlation length of 1250 m and observation operator H_2 are selected as the optimal parameters for assimilating remotely sensed sediment data into the sediment transport model.

Table 1 RMSEs (mg/L) for predicted sediment concentrations from the assimilation model using different error correlation radii (L) and two types of observation operators (H_1, H_2)

Figure 5 shows a comparison between in situ measured sediment concentrations and predicted sediment concentrations from the model with and without the assimilation of remotely sensed sediment concentrations. This figure demonstrates that the model with assimilation generates results that are better correlated with in situ measurements and more accurately reproduces the sediment dynamics at the in situ sites. The RMSE of the predicted sediment concentrations from the model with assimilation is reduced by 39.4%. Thus, the assimilation scheme satisfactorily improves the performance of the model’s sediment transport prediction capabilities.

Fig. 5 Comparison of the measured sediment concentrations and simulated results from the model with and without assimilated remote sensing sediment data

Spatial effect of assimilation

One of the advantages of satellite remote sensing is that it can effectively acquire large-scale environmental information from the Earth’s surface. Satellite remote sensing is believed to capture a relatively accurate record of the spatial distribution patterns of the state of the Earth’s surface environment. Therefore, in the present study, remote sensing observations are compared with the model-predicted results to evaluate the spatial effect of the assimilation.

Inconsistencies in the spatial distribution of sediment concentrations are detected by calculating the difference between the remotely sensed sediment concentrations and the sediment concentrations predicted with the model (without assimilation). Figure 6 shows the spatial distribution of the average difference between the remotely sensed and model-predicted sediment concentrations (RS-Model) for the four assimilation days (July 5, July 21, July 23, and July 25), revealing that the differences in sediment concentrations range from −20 to 20 mg/L in most areas. The differences are largest in the deepwater area from Duchang to Xingzi and reach 100 mg/L, which indicates that the sediment data values captured by remote sensing satellites are much greater than the predicted results from the model. The high concentrations captured by the satellite are most likely the result of resuspension of a large amount of sediment by sand dredging activities; these activities occur frequently during the wet season in this region, as noted by multiple studies (Feng et al. 2012; Liu 2012). Thus, the sediment transport model underestimates the sediment concentrations in this area because it does not consider the effect of dredging activities on sediment advection, resuspension, and deposition. Therefore, the model’s ability to predict the spatial distribution of sediment concentrations is expected to improve by capturing this error information and correcting the model results through the assimilation of remotely sensed sediment.
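The averaged difference field described above can be sketched as follows, assuming the remotely sensed and model fields have already been resampled to a common grid and that invalid pixels are stored as NaN. Both are assumptions for illustration, and the function name is hypothetical.

```python
import numpy as np


def mean_difference_map(rs_fields, model_fields):
    """Average (RS - Model) difference per grid over the assimilation days.

    rs_fields, model_fields: arrays of shape (number of days, number of grids).
    NaN entries (for example, masked pixels) are ignored in the average.
    """
    rs = np.asarray(rs_fields, dtype=float)
    model = np.asarray(model_fields, dtype=float)
    return np.nanmean(rs - model, axis=0)
```

Positive values mark grids where the satellite retrieval exceeds the model prediction, as in the dredging area between Duchang and Xingzi.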

Fig. 6

Spatial distribution of the average difference between remotely sensed and model-predicted sediment concentrations (RS-Model) on July 5, July 21, July 23, and July 25

Figure 7 shows the spatial distribution of predicted sediment concentrations from the model with and without assimilation and remote sensing observations on July 23 and July 25. A significant difference in the spatial distribution pattern between the predicted sediment from the model without assimilation and the remotely sensed sediment is observed. The predicted sediment distributions from the model with assimilation are generally spatially consistent with remotely sensed sediment distributions over the entire area on July 23 and July 25 because accurate spatial distribution information is integrated into the model by updating the sediment concentration fields on July 21 and July 23. In the deep area from Duchang to Xingzi, the predicted sediment concentrations from the model with assimilation increase, and the spatial distribution pattern is more closely aligned with the remote sensing observations. In this area, the sediment concentrations from the model without assimilation are under-predicted because sand dredging effects are not considered. Therefore, the sediment dynamics induced by dredging activities are integrated into the model through assimilation, generally reducing the model’s sediment transport prediction errors. Moreover, the spatial distribution patterns for the river mouth areas, especially near the Xinjiang River mouth on July 23, are reasonably improved. As demonstrated in Fig. 7, the predicted sediment concentrations from the model with assimilation are generally greater than those from the remote sensing data. Because the remote sensing data represent only the surface concentrations whereas the model gives depth-averaged concentrations, the greater concentrations of the latter can be considered reasonable and an improvement of the model resulting from the assimilation. In general terms, the accuracy of sediment spatial distribution predictions can be improved through the assimilation of remote sensing images into the sediment transport model.

Fig. 7

Comparison between the spatial distributions of remotely sensed sediment and predicted results from the model with and without assimilation

The RMSE of the model-predicted sediment is calculated against the remote sensing observations to assess how well assimilation improves the predicted spatial distribution. The RMSE values of the predicted sediment from the models without and with assimilation on July 23 and July 25 are displayed in Table 2. The RMSE values of the predicted sediment from the model with assimilation are markedly lower than those from the model without assimilation. The relative reductions in RMSE are 19.7 % and 34.6 % for July 23 and July 25, respectively, indicating that the predicted results with assimilation are much more spatially consistent with the remote sensing observations. Generally, assimilating remotely sensed sediment can have positive spatial effects on sediment transport modeling.
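The RMSE comparison can be illustrated with a short sketch. It assumes the standard RMSE definition over co-located cells and takes the relative reduction to be (RMSE without assimilation − RMSE with assimilation) / RMSE without assimilation; the text does not spell out either formula, so both are assumptions, as are the function names and the sample values.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences (mg/L)."""
    pairs = list(zip(predicted, observed))
    return math.sqrt(sum((p - o) ** 2 for p, o in pairs) / len(pairs))


def relative_reduction(rmse_without, rmse_with):
    """Percentage by which assimilation lowers the RMSE (assumed definition)."""
    return (rmse_without - rmse_with) / rmse_without * 100.0


# Made-up concentrations at four cells (mg/L), for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]             # remote sensing
model_without = [14.0, 16.0, 34.0, 36.0]        # model, no assimilation
model_with = [12.0, 18.0, 32.0, 38.0]           # model, with assimilation

rmse_without = rmse(model_without, observed)
rmse_with = rmse(model_with, observed)
reduction = relative_reduction(rmse_without, rmse_with)
```

In this toy case every cell is off by 4 mg/L without assimilation and by 2 mg/L with it, so the RMSE halves.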

Table 2 RMSE values (mg/L) between remote sensing observations and predicted sediments from the model with and without assimilation, and the relative improvement of RMSE

Temporal effect of assimilation

However, because of errors in the parameters and boundary conditions of the model, the sediment transport model cannot remain improved unless remote sensing images are assimilated sequentially, and the effect of an assimilation on the model would vanish over a long modeling period. To analyze the temporal effect of assimilation, model results from three points of interest in different areas of the lake are selected (see Fig. 8). The background of Fig. 8 shows the average velocity vectors at the four assimilation times. Point A is in the narrow northern lake, point B is at the junction between the northern lake and the southern lake, and point C is in the southern lake (the main part of the lake).

Fig. 8

Location of the three points of interest. The background is the average velocity vectors of the four assimilation times

Here, we define the “model with a single update” as assimilating only one of the four remote sensing images into the model; therefore, four single-update experiments are conducted, one for each remote sensing image. Figure 9 shows the predicted sediment concentrations from the model without assimilation and from the model with a single update at the three points of interest from July 1 to August 15. The results from the model with a single update on July 5 and July 21 are presented in Fig. 9. The comparisons show that the predicted sediment dynamics at the three points change after integrating the model results with the remotely sensed sediment data. Nevertheless, the predicted results from the models with and without assimilation generally show similar trends. As the model continues to forecast, the predicted results from the model with assimilation (single update) gradually align with those from the model without assimilation, indicating that the effects of assimilation on the model predictions are temporally restricted. Figure 9 also reveals that the length of the temporal effect varies for the different points of interest.

Fig. 9

Comparison of the simulated results from the model with and without assimilating the remotely sensed sediment data on July 5 and July 21 at three points of interest

In the present study, the length of the temporal effect of assimilation is defined as the time span when the absolute relative difference (ARD) between the predicted results from the model with and without assimilation is greater than 5 %. The ARD is calculated as follows:

$$ \mathrm{ARD} = \left|\frac{C_2 - C_1}{C_1}\right| \times 100\% $$

where $C_1$ is the predicted sediment concentration from the model without assimilation and $C_2$ is the predicted sediment concentration from the model with assimilation. Table 3 displays the length of the temporal effect of the assimilation at the four assimilation times for the three points of interest and shows that the temporal effect varies with assimilation time and differs among the three points of interest. The longest effect time is 560 h (23.3 days), identified at point C on July 25, and the shortest effect time is 65 h (2.7 days). In general, the temporal effect occurs for an average of 259 h (10.8 days). The average length of this effect across the three points for each assimilation time is approximately 200 h, although the average length of the temporal effect over the four assimilation times for different points varies dramatically, indicating a spatial difference in the temporal effect of the assimilation on the model predictions. Among the three points of interest, the temporal effect is longest at point C in the southern lake and shortest at point B at the junction of the southern and northern lakes, with point A in the northern lake falling between them.
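The ARD criterion can be turned into a small routine for estimating the length of the temporal effect at one point. The sketch below is one possible reading of the definition, not the procedure used in the study: it assumes both series start at the assimilation time, are sampled at a fixed interval (`dt_hours`, hypothetical), and that the effect ends the first time the ARD falls to 5 % or below. The function names and sample values are illustrative.

```python
def ard(c_without, c_with):
    """Absolute relative difference in percent; c_without must be non-zero."""
    return abs((c_with - c_without) / c_without) * 100.0


def effect_length_hours(series_without, series_with,
                        dt_hours=1.0, threshold=5.0):
    """Hours after the update during which the ARD stays above the threshold.

    Both series start at the assimilation time. Counting stops at the first
    step where the ARD is at or below the threshold (assumed interpretation).
    """
    hours = 0.0
    for c1, c2 in zip(series_without, series_with):
        if ard(c1, c2) <= threshold:
            break
        hours += dt_hours
    return hours


# Made-up hourly concentrations (mg/L) at one point, for illustration only.
without_assim = [50.0, 50.0, 50.0, 50.0, 50.0]
with_assim = [80.0, 65.0, 55.0, 52.0, 50.5]
length = effect_length_hours(without_assim, with_assim)
```

Here the ARD is 60 %, 30 %, 10 %, 4 %, and 1 % at successive hours, so the effect lasts three hourly steps. A different convention, such as counting every step above the threshold even after a temporary dip, would give longer estimates when the two series cross.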

Table 3 Length of the temporal effect for the points of interest at four assimilation times

The differences among the points of interest in the length of the temporal effect may be due to spatial differences in the current velocities. Figure 10 compares the time series of current velocities from July 1 to August 15 and shows that the current velocities at point C are generally less than those at points A and B. Because current velocity dominates the transport, deposition, and erosion of suspended sediment, the variation rate of the sediment concentrations would be much lower at point C; therefore, the information obtained from the remote sensing data at that point would remain longer in the numerical model, thereby increasing the impact of the assimilation on the model. However, although the current velocity at point A is greater than that at point B, the temporal effect at point B is not longer than that at point A. This result may be related to point A’s location in the lower reach of the northern lake. Because waters flow from point B to point A (see Fig. 8), point A is not only affected by the new information from remote sensing observations at the assimilation time but is also influenced by the sediment transported from point B, which also carries information from the model with assimilation. Therefore, more accurate information is retained at point A, resulting in a longer temporal effect of assimilation on the model’s predictive capabilities. Thus, the length of the temporal effect of assimilation varies spatially and depends on the magnitude and direction of the current velocity.

Fig. 10

Comparison of the mean sediment concentrations and current velocity magnitude over the acquisition times of the four assimilated remote sensing images

Conclusions

Our study predicts sediment transport in China’s largest freshwater lake, Poyang Lake, by assimilating remotely sensed suspended sediment based on CCD images recorded by the HJ satellite into a two-dimensional sediment transport model. OI is used as the assimilation method, and the error correlation length and observation operator for OI are determined through a series of assimilation experiments that evaluate sediment predictions based on field measurements. The model with assimilation produces sediment predictions with RMSE values that are 39.4 % lower than those of the model without assimilation, indicating the effectiveness of the explored assimilation scheme. The spatial effect of assimilation is evaluated based on remotely sensed sediment, revealing that the model with assimilation produces more accurate spatial distribution patterns of suspended sediment than the model without assimilation. The temporal effect of assimilation on the model predictions is explored, revealing that the average length of the temporal effect, which varies spatially, is approximately 10.8 days. The current velocity, which dominates the rate and direction of sediment transport, is most likely the reason for the spatial variation in the length of the temporal assimilation effect on the model predictions.

This study demonstrates that remote sensing images incorporated into a sediment transport model through assimilation can substantially improve predictions of sediment concentrations in Poyang Lake in space and time. This method can be applied to other inland lakes and coastal regions where modeling and prediction of sediment transport and ocean color parameters, such as chlorophyll, are of interest. To achieve a long-term effect on the model’s predictive capacity, future studies could employ multi-platform remote sensing data to narrow the gap in the assimilation time.

The present model of Poyang Lake does not include the effects of sand dredging on sediment transport. However, the resulting errors are reduced to some extent by assimilating remotely sensed sediment, which is one improvement yielded by the assimilation. In future studies, the physical processes of sand dredging activities can be considered in the sediment transport modeling, and assimilation schemes such as the ensemble Kalman filter and adjoint and variational methods can be used to estimate the model parameters related to sand dredging activities.

Further work can also explore assimilation schemes to improve sediment predictions for the entire vertical water column by assimilating surface sediment concentrations into a three-dimensional sediment transport model. Error correlations of the predicted sediment concentrations between the surface layer and lower column can also be considered in the improved schemes. Moreover, assimilating current velocities into a three-dimensional model could provide valuable data. Because current dynamics largely govern sediment movement in water bodies, it is likely that improving current circulation modeling will result in more accurate predictions of suspended sediment concentrations.