Introduction

Poor water quality threatens ecosystem functioning, biodiversity, and human health (UN 2021; Díaz et al. 2006; Darwish et al. 2021; Makwinja et al. 2022a, 2022b). The World Health Organisation (WHO) ranks it among the leading causes of death worldwide (WHO 2018; Makwinja 2022). A WHO report showed that 51 million deaths occurred in 1993 and were linked to cyanobacterial blooms and infectious and parasitic diseases; in developing countries, these causes accounted for 44% of all deaths and 71% of deaths of children (Podewils et al. 2004). In 2016, the Lancet Commission on Pollution and Health revealed that water, air, soil, and chemical pollution were responsible for 940,000 deaths in children, two-thirds of them under age 5. Landrigan et al. (2019) further pointed out that 92% of pollution-related deaths in children occur in low- and middle-income countries. Recent estimates indicate that about 1.1 billion people lack clean water and are exposed to cyanobacterial blooms, and projections indicate that about 4.8–5.7 billion people will face severe water crises by 2050 (Pimentel et al. 2007; WHO 2008; Makwinja 2022). Challenges such as hypoxic conditions, reduced fish and shellfish production, odor, emission of greenhouse gases, and degradation of cultural and social values are linked to eutrophication, an emerging threat to freshwater ecosystem functioning globally, especially with the onset of climatic and anthropogenic drivers (Warner et al. 2010; Yang et al. 2016; Habersack and Samek 2016; Wurtsbaugh et al. 2019; Makwinja et al. 2019, 2021a; Nkwanda et al. 2021; Kosamu et al. 2022). Currently, blue-green algae represent more than half of the algal biomass in water bodies, and estimates under climatic and anthropogenic drivers indicate an upward trend (Danaher et al. 2022). 
In European coastal waters and the USA, the infestation of these algae has led to annual economic losses of about US$ 1 billion and US$ 2.4 billion, respectively (Wurtsbaugh et al. 2019). In Sub-Saharan Africa, these figures could be higher, though no initiatives have been undertaken because few studies have focused on the problem, suggesting the need for comprehensive water quality monitoring programs in Africa (Holland et al. 2012; Collen et al. 2014).

Effective water quality monitoring programs require developing advanced instruments, monitoring infrastructure such as analytical laboratories, and skilled human resources (Abdel-Dayem 2011). Field-based surveys and laboratory techniques have been used for decades to achieve this objective (Fölster et al. 2014). Nevertheless, the techniques have been proven labor-intensive, costly, and exhibit poor precision, especially for large-scale monitoring (Lambrou et al. 2014; Gholizadeh et al. 2016; Yang et al. 2020). Far-sighted to these challenges, satellite remote sensing has been proposed to provide large-scale, long-duration, and periodic water quality monitoring (Hart and Matthews 2018; Kravitz et al. 2020; He et al. 2020). The technique makes it possible to monitor and identify large-scale water bodies suffering from pollution more effectively and efficiently (Gholizadeh et al. 2016). It has been used since the 1970s and continues to be widely used in water quality assessment worldwide (Duan et al. 2013). Nuisance algal blooms that cause aesthetic degradation resulting in unpleasant odor and possible adverse effects to human health from toxic blue-green algal have been detected by remote sensing (Randolph, et al. 2008). Remote retrieval of chl-a offers valuable insights into monitoring seasonal and spatial chl-a distribution in water bodies, a fundamental factor in understanding the inland water’s trophic and limnological state (Dörnhöfer and Oppelt 2016; Zhu and Mao 2021). Time series remotely sensed reflectance provides a valuable opportunity to advance the basin-scale knowledge of the most productive lakes’ ecosystems (Huang et al. 2014; Cózar et al. 2014). The synoptic perspective provided by satellite data offers valuable information regarding the surface spatial–temporal characteristics of the lake’s primary productivity (Hestir, et al. 2015). 
Integrating stand-alone radar with optical remote sensing data makes it possible to detect flooding and to estimate the surface area and volume of shallow freshwater lakes (Henderson and Lewis 2008; Silva et al. 2010). Quantitative measurement of the annual growth pattern of phytoplankton has also become possible owing to the acceptable temporal resolution of satellite observations (Martin-Platero et al. 2018).

Remote sensing instruments have been applied in water quality monitoring studies across the globe. These include ocean color sensors such as the Coastal Zone Color Scanner and the Sea-viewing Wide Field-of-view Sensor (SeaWiFS), Earth observation systems such as the Medium Resolution Imaging Spectrometer (MERIS), meteorological satellite sensors such as the Advanced Very High-Resolution Radiometer, and medium- to high-resolution land resources satellites such as Landsat-8 (Operational Land Imager (OLI)) and Sentinel-2 (MultiSpectral Instrument (MSI)) (Kiselev et al. 2015; Mathews and Odermatt 2015; Oyama et al. 2015; Page et al. 2018; Kravitz et al. 2020; Mittenzwey et al. 1992; Dörnhöfer and Oppelt 2016). For example, satellite-based sensors have been used to assess the trophic state of European perialpine lakes, information that was used to frame the Water Framework Directive (Bresciani et al. 2011). In Italy, Pinardi et al. (2015) assessed potential algal blooms in Mantua Superior Lake by integrating hydrodynamic modeling and remotely sensed images. Li et al. (2015) applied an extended inherent optical properties (IOP) Inversion Model of Inland Waters (IIMIW) to estimate phycocyanin in inland waters. Li et al. (2021) used Sentinel-2 MSI imagery with a machine learning algorithm to quantify chl-a in Chinese lakes. Dall'Olmo et al. (2005) assessed the potential application of SeaWiFS and the Moderate Resolution Imaging Spectroradiometer (MODIS) for estimating chl-a concentration in turbid productive waters using red and near-infrared (NIR) bands. Manzo et al. (2014) performed a sensitivity analysis of a bio-optical model for Italian lakes with a primary focus on Landsat-8, Sentinel-2, and Sentinel-3. Saberioon et al. (2020) estimated chl-a and total suspended solids using Sentinel-2A and machine learning algorithms. Wang and Atkinson (2018) applied a MODIS-derived Forel-Ule index to assess the trophic state of global inland waters. Kuhn et al. (2020) integrated satellite and airborne remote sensing techniques to assess gross primary productivity in Boreal Alaskan lakes.

In Africa, the application of remote-sensing techniques in water quality research has expanded dramatically (Matthews et al. 2012), though the spatial distribution is quite uneven (De Roeck et al. 2008). For example, in South Africa, MERIS and Landsat 8 OLI data have been used by Smith and Pitcher (2015), Matthews and Bernard (2015), Malahlela et al. (2018), and Dzurume et al. (2022), to detect trophic status (chlorophyll-a), cyanobacterial dominance, surface scums, and floating vegetation in inland and coastal waters. In Kenya, Tebbs et al. (2013) used Landsat ETM + to measure cyanobacterial biomass in Lake Bogoria-hypertrophic saline-alkaline flamingo lake. Horion et al. (2010), Knox et al. (2014), Cózar et al. (2014), and Gidudu et al. (2021)focused their studies on Lakes Tanganyika, Kivu, and Victoria. Acker et al. (2008) used a SeaWiFS and data from MODIS to detect the chl-a seasonal variability in the northern Red Sea. In Nigeria, Ayeni and Adesalu (2018) integrated field-based techniques and Landsat 7 (ETM +) and Landsat 8 (OLI) remote sensing-based techniques to determine the chl-a concentration in the Lagos Lagoon. In Ghana, Rani et al. (2019) applied a NIR algorithms-based model for chlorophyll-a retrieval in highly turbid Inland Densu River Basin in South-East Ghana, West Africa. In Lake Chad, Buma and Lee (2020) applied Sentinel-2 and Landsat 8 images to estimate chl-a concentration. In Mozambique, Kyewalyanga et al. (2007) applied a non-linear model that uses satellite-derived chl-a to estimate water column primary production in Delagoa Bight. Kapalanga et al. (2021) used remote-sensing-based algorithms such as Landsat 8 to monitor water quality in Olushandja Dam, North-Central Namibia. In Ethiopia, Womber et al. (2021) estimated suspended sediment concentration in Lake Tana, using MODIS-Terra and in situ data. 
While the number of studies in Africa using remote sensing techniques to monitor inland water bodies has grown exponentially, the technique is still in its infancy in some countries, such as Malawi (Dube et al. 2015; Chavula et al. 2009). We therefore estimated the trophic status of Lake Malombe in Malawi, a lake likely to be affected by eutrophication and algal blooms. We integrated in situ data with a Sentinel-2 MSI algorithm that uses two bands, at central wavelengths of 665 nm and 705 nm, with relatively narrow spectral bandwidths (31 nm and 15/16 nm, respectively), which correspond to the strong absorption feature in the red and the reflectance peak of chl-a (Page et al. 2018), to estimate the spatial and temporal dynamics of chl-a in Lake Malombe.

Materials and methods

Study area

Lake Malombe (Fig. 1), with an extensive catchment area of 3387 km2, is located in Southern Malawi (Makwinja et al. 2021b). It is connected to Lake Malawi by a 19 km stretch of the Shire River, the outlet of Lake Malawi at its South East Arm. The lake lies within the Afro-Arabian Rift Valley, the most extensive rift valley system on Earth, which extends from Jordan in southwestern Asia through the Ethiopian highlands to Mozambique (Makwinja et al., 2022a). The lake is shallow, with an average depth of around 2–2.5 m, a maximum depth of 7 m, and a total area of 420 km2. It is fed by the most eutrophic water from Lake Malawi and by enriched streams flowing in from its highly populated catchment area, and nutrients are recycled from its sediments because of its shallowness, making it one of the most productive lakes in Africa. Native and migrant birds such as waterfowl, cormorants, kingfishers, and others form part of the lake ecology. Lake Malombe’s temperatures range from 13 to 35 °C, and the average annual rainfall ranges from 600 mm on the rift valley floor to 1600 mm in the mountainous areas (Makwinja et al. 2022c). The lake catchment is characterized by distinct habitats and is rich in aquatic and terrestrial flora and fauna; it contributes US$ 124.36 million/year (about 1.97% of Malawi’s gross domestic product) and supports 97.74% of the local population’s livelihoods (Makwinja et al. 2021b). Like other shallow inland freshwater lakes in Africa, such as Lakes Chilwa, Chiuta, Kyoga, Rukwa, Baringo, Mweru Wa Ntipa, Naivasha, Awassa, Langano, and Victoria, Lake Malombe is thoroughly mixed, somewhat turbid, highly productive, and climate-sensitive (Makwinja et al., 2022a). The northern part of the lake is near Mangochi town (Kosamu et al., 2022; Makwinja et al. 2022c). Increased human activities, over-exploitation, climatic drivers, settlements, and unsustainable farming activities overstress the catchment and increase the nutrient load entering the lake (Makwinja et al. 2021b).

Fig. 1

Map of Lake Malombe. Note: a calibration (N = 10) and b validation (N = 19; the chl-a concentration at the east point for July was not available due to cloud coverage). The calibration curve was Chl-a = 431.98 NDCI2 + 104 NDCI + 9.547 (R2 = 0.54). Based on the NDCI algorithm for Lake Malombe, predicted chl-a concentrations were estimated; the RMSE for the validation dataset was 2.88 mg/m3. The dashed line represents the 1:1 line

Field and laboratory analysis of Lake Malombe water physico-chemical parameters

Ground truth data and optical measurements were collected from the lake’s inlet, western, eastern, and middle sections and its outlet from March to October 2019. We selected the sampling stations and depths based on the continuous lake monitoring program so that they spanned the entire surface of the lake. The average sampling depth was 3.8 m at the inlet, 2.0 m in the eastern part, 2.0 m in the western part, 3.8 m in the middle, and 2.8 m at the outlet. The sampling points were located near the lake shore, except for those at the inlet, outlet, and middle. We measured the depth profiles of water temperature, pH, electrical conductivity (EC), and dissolved oxygen (DO) using the WTW MultiLine P4 multi-parameter probe (model Multi 3430). Before taking the measurements, we calibrated the instrument according to the manufacturer’s instructions. Based on these profiles, we collected water samples using automatic ISCO samplers (Teledyne Isco, Lincoln, NE, USA), one at each sampling station. We kept the water samples in 5-L UWITEC hydrophilic bottles at − 4 °C for laboratory analysis. We also measured turbidity using a PCE-TUM 20 turbidity meter (Naumenko 2008). We analyzed soluble reactive phosphorus (SRP), nitrite, and sulfate (\({\mathrm{PO}}_{4}^{3-}\), \({\mathrm{NO}}_{2}^{-}\), \({\mathrm{SO}}_{4}^{2-}\)) using a UV/Visible spectrophotometer (model T90), following the molybdenum blue method in conjunction with ultraviolet spectrophotometry for \({\mathrm{NO}}_{2}^{-}\) and \({\mathrm{PO}}_{4}^{3-}\) and the turbidimetric method for \({\mathrm{SO}}_{4}^{2-}\).

Chlorophyll-a in situ measurement

The chl-a in situ data were collected through multiple field samplings from March to October 2019. The surface water samples were carefully collected from a small boat. The samples were filtered through a Whatman GF/F glass microfiber filter with a 25 mm diameter at 0.7 bar residual pressure. Before measurement, the extracts were mixed thoroughly and centrifuged for about 10 min at 500 × g (where g is the gravitational acceleration, 9.81 m/s2). A filter fluorometer with an excitation wavelength of 430 nm (10 nm bandwidth) and an emission wavelength of 680 nm (10 nm bandwidth) was used to determine chl-a. We calibrated the fluorometer using a commercial solution of pure chl-a manufactured by Sigma, UK. The concentration of the solution (in 90% acetone) was determined spectrophotometrically using an extinction coefficient of 87.67 L/g at 664 nm against a 90% acetone blank. The calibration was carried out with different chl-a concentrations covering the full linear range of the relationship between chl-a concentration and instrument output. In addition, the maximum acid ratio was determined by measuring the fluorescence of the standard before and after acidification. The sample extracts were then carefully pipetted from the centrifuge tubes into the fluorometer cuvette and measured against a 90% acetone blank. Next, 0.2 ml of 1% v/v hydrochloric acid was added to the cuvette, mixed thoroughly, left for 2–5 min, and the sample was measured again against a 90% acetone blank. Chl-a was determined using the equation of Holm-Hansen et al. (1965) (Eq. 1).

$$\mathrm{Chl}\text{-}\mathrm{a}\left(\upmu \mathrm{g}/\mathrm{L}\right)=K\times \left(\frac{{F}_{m}}{{F}_{m}-1}\right)\times {V}_{e}\times \frac{{F}_{0}-{F}_{a}}{{V}_{f}}$$
(1)

where K = calibration coefficient = µg chl-a/ml 90% acetone per instrument fluorescence units, \({F}_{m}\)=maximum acid ratio \(\frac{{\mathrm{F}}_{0}}{{\mathrm{F}}_{\mathrm{a}}}\) of pure chl-a standard, \({F}_{0}\)=sample fluorescence before acidification, \({F}_{\mathrm{a}}\) is sample fluorescence after acidification, \({V}_{e}\) is extraction volume (ml), and \({V}_{f}\) is a filtered volume (L).
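As a minimal sketch of how Eq. 1 turns fluorometer readings into a concentration, the function below evaluates it directly. It reads the acid-ratio factor as F_m/(F_m − 1), which is an assumption about the intended form of the correction; the function name, argument names, and the numeric readings are hypothetical and serve only as an illustration.

```python
def chl_a_ug_per_l(k, f_m, f_0, f_a, v_e_ml, v_f_l):
    """Chl-a (µg/L) from fluorescence before and after acidification (Eq. 1).

    k      : calibration coefficient (µg chl-a per ml of 90% acetone per
             fluorescence unit)
    f_m    : maximum acid ratio F0/Fa of the pure chl-a standard
    f_0    : sample fluorescence before acidification
    f_a    : sample fluorescence after acidification
    v_e_ml : extraction volume (ml)
    v_f_l  : filtered volume (L)
    """
    if f_m <= 1.0:
        raise ValueError("the acid ratio F_m must be greater than 1")
    if v_f_l <= 0.0:
        raise ValueError("the filtered volume must be positive")
    # Assumption: the correction factor is F_m / (F_m - 1).
    return k * (f_m / (f_m - 1.0)) * v_e_ml * (f_0 - f_a) / v_f_l


# Hypothetical readings, not measurements from this study.
example = chl_a_ug_per_l(k=0.002, f_m=2.0, f_0=150.0, f_a=90.0,
                         v_e_ml=10.0, v_f_l=0.5)
print(round(example, 3))
```

With K in µg/ml per fluorescence unit, the extraction volume in ml, and the filtered volume in L, the result comes out in µg/L, matching the units used in the rest of the paper.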

Satellite observations and chl-a reflectance

This study used atmospherically corrected Sentinel-2 MSI surface reflectance images over the Lake Malombe region (14.509S, 35.145E, 14.828S, 35.386E). Chl-a, a light-harvesting pigment in photosynthetic organisms, including algae, was used as a proxy for Lake Malombe’s primary productivity. A spectral shape algorithm can be used to derive a chl-a index and estimate algal abundance in lakes; Wynne et al. (2008) applied this technique to derive the Cyanobacteria Index (Sharp et al. 2021). However, Toming et al. (2016) pointed out that MSI was not designed for aquatic remote sensing and lacks the bands required by this algorithm. We therefore applied a modified normalized difference chlorophyll index (NDCI) to Sentinel-2 imagery to monitor chl-a; the original NDCI, proposed by Mishra and Mishra (2012), uses the normalized difference of the spectral bands at 709 nm and 665 nm. Page et al. (2018) noted that the NDCI (Eq. 2) is a well-established index developed to monitor and quantify the distribution and concentration of aquatic algae:

$$\mathrm{NDCI}=\frac{R\left(705\right)-R\left(665\right)}{R\left(705\right)+R\left(665\right)}$$
(2)

Here, R(λ) in Eq. 2 represents the reflectance at wavelength λ (in nanometers). The NDCI values can be converted to per-pixel chl-a concentrations. Mishra and Mishra (2012) demonstrated a strong relationship between the NDCI and chl-a concentration in turbid productive waters. Page et al. (2018) noted that the NDCI cannot distinguish toxic from non-toxic cyanobacteria species; instead, the index serves as a phenological assessment tool to detect algal bloom occurrences. The relationship between chl-a concentration and NDCI can be described by the quadratic function in Eq. 3, which should be calibrated for each studied lake by comparing satellite-derived NDCI values with observed chl-a concentrations.

$$Y=a\times {x}^{2}+b\times x+c$$
(3)

where a, b, and c are obtained from fitting.
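The two steps above (computing the NDCI from the 705 nm and 665 nm reflectances, then fitting the quadratic of Eq. 3 to paired NDCI and chl-a observations) can be sketched as follows. This is an illustrative pure-Python version that solves the least-squares normal equations directly; the function names are hypothetical, and the paper does not state which fitting routine was actually used.

```python
def ndci(r705, r665):
    """Normalized difference chlorophyll index from two reflectances (Eq. 2)."""
    total = r705 + r665
    if total == 0:
        return float("nan")  # undefined when both reflectances are zero
    return (r705 - r665) / total


def _det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_quadratic(x, y):
    """Least-squares (a, b, c) for y = a*x**2 + b*x + c (Eq. 3)."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need at least three paired observations")
    s = [sum(xi ** k for xi in x) for k in range(5)]                   # sum of x^k
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]  # sum of y*x^k
    # Normal equations for the unknowns (a, b, c).
    matrix = [[s[4], s[3], s[2]],
              [s[3], s[2], s[1]],
              [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    det = _det3(matrix)
    if det == 0:
        raise ValueError("x values do not determine a unique quadratic")
    coeffs = []
    for col in range(3):  # Cramer's rule, one unknown per column
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        coeffs.append(_det3(replaced) / det)
    return tuple(coeffs)
```

In use, `x` would hold the satellite-derived NDCI values at the sampling sites and `y` the matching in situ chl-a concentrations. With only a handful of calibration points the fitted coefficients are sensitive to individual observations, which is one reason a lake-specific calibration is needed.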

Image sampling techniques

Google Earth Engine (GEE) has emerged as a geospatial analysis tool that handles computationally intensive tasks through cloud computing and lets users perform planetary-scale analyses with free access to rich data collections (Gorelick et al., 2017). This study used GEE as its cloud computing platform. We used atmospherically corrected level-2 Sentinel-2 MSI surface reflectance images acquired from March 28 to October 31, 2019 (30 images). To avoid confounding the satellite signal, pixels flagged as cloud were masked using the QA60 band, a bitmask band carrying cloud mask information (Coffer et al. 2020). Table 1 summarizes the dates of sampling and satellite overpasses, separated into calibration and validation sets. Samples from June 24, 2019, and September 12, 2019 (10 field samples) were used to establish the site-specific relationship between the NDCI and chl-a concentration, since the satellite overpasses of June 23 and September 11, 2019, were nearly synchronous with the sampling dates (a gap of 1 day). Other datasets had wider time gaps and were therefore not used for calibration. June and September are rainy, cold-windy, and hot-dry seasons, respectively. Hence, the established calibration curve represents different seasons. For the remaining sampling dates there were no match-up points, so the exploitable satellite images had time gaps of 3 to 5 days (May 9, July 13, August 17, and October 16). Few satellite images were available from November to February because of high cloud cover; these months were therefore excluded from the analysis. We used MODIS-derived land surface temperature data, an 8-day average composite on a 1200 × 1200 km grid. The central point of the lake (14.642°S, 35.257°E) and a location near the west shoreline (14.643°S, 35.163°E) were selected as monitoring points for MODIS imagery. We obtained precipitation data from the Global Rainfall Map (GSMaP_MVK) of JAXA Global Rainfall Watch (produced and distributed by the Earth Observation Research Center, Japan Aerospace Exploration Agency).
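The QA60 cloud masking described above comes down to testing two bits per pixel. The sketch below assumes the bit layout documented for the Sentinel-2 QA60 band (bit 10 for opaque clouds, bit 11 for cirrus); it operates on plain Python lists rather than Earth Engine images, and the function names are hypothetical.

```python
CLOUD_BIT = 1 << 10   # QA60 bit 10: opaque clouds (assumed bit layout)
CIRRUS_BIT = 1 << 11  # QA60 bit 11: cirrus clouds (assumed bit layout)


def is_clear(qa60):
    """True when neither the opaque-cloud nor the cirrus bit is set."""
    return (qa60 & CLOUD_BIT) == 0 and (qa60 & CIRRUS_BIT) == 0


def mask_pixels(values, qa60_values):
    """Keep values of clear pixels and replace flagged pixels with None."""
    return [v if is_clear(q) else None for v, q in zip(values, qa60_values)]


# Three hypothetical pixels: clear, opaque cloud, cirrus.
print(mask_pixels([0.10, 0.20, 0.30], [0, 1024, 2048]))
```

In an Earth Engine workflow the same test would be applied per image before computing the NDCI, so that cloudy pixels never enter the buffer-zone medians.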

Table 1 Calibration and validation datasets acquired in 2019

The double-logistic function has proven helpful in detecting and predicting temporal patterns in the land, particularly for gross primary productivity. This function is expressed as a curve defined by a specific amplitude and two inflection points. Therefore, seasonal variation and cycles in chl-a can be characterized by amplitude values and the difference between the two inflection points. This study applied a modified weighted double-logistic function to satellite-derived time series data of chl-a concentrations. The modification includes changes of a sign placed in the power of the exponential term, thereby flipping the unimodal shape of the original double-logistic function. This is necessary because, in this study, satellite data were mainly available during a season in which phytoplankton activities are weak. A double-logistic function (Eq. 4) was applied to fit the temporal chl-a concentration level pattern during 2019 in Lake Malombe:

$${U}_{t}={m}_{1}+\left({m}_{1}-{m}_{2}\right)\times \left(\frac{1}{1+{e}^{{m}_{3}\left(t-{m}_{4}\right)}}+\frac{1}{1+{e}^{{m}_{5}\left(t-{m}_{6}\right)}}-1\right)$$
(4)

where \({U}_{t}\) is a satellite-derived chl-a concentration at time t; m1 is the maximum value; m2 is the minimum value; m1 − m2 is the annual amplitude of \({U}_{t}\); m3 and m5 are the rate of change at the end of the season (EOS) and the start of the season (SOS), respectively; and m4 and m6 are the day of the year (DOY) associated with the EOS and SOS, respectively, which can be used for calculation of the length of the season (Zeng et al. 2020). It should be noted that EOS is followed by SOS (this order is opposite from that of the original double-logistic function). To extract these metrics, Eq. 4 is solved using the Levenberg–Marquardt least-squares algorithm, which is implemented in the non-linear least-squares optimization Python package, LMFIT (Newville et al. 2015). MSI pixels located at a 300-m buffer zone of the five sampling sites were used to calculate a median NDCI value, regarded as a representative value for each sampling site.
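To make the shape of the flipped double-logistic curve concrete, the sketch below evaluates Eq. 4 for given parameters. Two assumptions are made here and should be read as such: the trailing −1 is taken as part of the bracketed sum, so that the curve spans the range from m2 to m1, and the parameter values in the example are hypothetical rather than the fitted values of this study. The fitting step itself, which the paper performs with LMFIT, is not reproduced.

```python
import math


def _logistic_term(z):
    """1 / (1 + e^z), arranged to avoid overflow for large |z|."""
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def double_logistic(t, m1, m2, m3, m4, m5, m6):
    """Flipped double-logistic curve (Eq. 4) at day of year t.

    m1, m2 : maximum and minimum of the curve
    m3, m4 : rate of change and day of year at the end of the season (EOS)
    m5, m6 : rate of change and day of year at the start of the season (SOS)
    """
    first = _logistic_term(m3 * (t - m4))
    second = _logistic_term(m5 * (t - m6))
    # Assumption: the "- 1" sits inside the bracket, so U_t stays in [m2, m1].
    return m1 + (m1 - m2) * (first + second - 1.0)


# Hypothetical parameters: EOS at day 100, SOS at day 250, m5 < 0 for the rise.
params = dict(m1=12.0, m2=4.0, m3=0.2, m4=100.0, m5=-0.2, m6=250.0)
for day in (0, 100, 175, 250, 365):
    print(day, round(double_logistic(day, **params), 3))
season_gap_days = params["m6"] - params["m4"]  # interval between EOS and SOS
```

With a positive m3 and a negative m5, the curve starts near m1, drops to m2 after the EOS, and climbs back to m1 after the SOS, which is the valley shape described in the text.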

Model calibration

For calibration, we compared in situ data with satellite-derived NDCI values. We report the coefficient of determination (R2) for calibration because it expresses the degree of association between the NDCI and chl-a concentration. We do not report the root mean square error (RMSE) for calibration, because RMSE is computed from the difference between predicted and measured values and is therefore used to evaluate the model’s fit in validation. For validation, we compared field-observed chl-a with satellite-derived chl-a; we do not report R2 there, because comparing predicted and measured values already shows how close the points lie to the 1:1 line. To avoid picking up a single irregular pixel that happens to coincide with the coordinates of a sampling point, we created a 300-m buffer zone around each of the five sampling sites, sampled the MSI pixels within it, and took the median NDCI value as the representative value for that site. The relation between the derived NDCI and in situ chl-a data can be described by the quadratic function in Eq. 5, a form that has shown satisfactory performance in earlier studies (Mishra and Mishra 2012; Page et al. 2018; Weber et al. 2020).

$$\mathrm{Chl}-\mathrm{a}=431.98{\mathrm{NDCI}}^{2}+104\mathrm{NDCI}+9.547$$
(5)
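The conversion from buffer-zone pixels to a site-level chl-a estimate can be sketched as follows, using the coefficients of Eq. 5 and the median over the 300-m buffer described above. The function names and the sample pixel values are hypothetical; `None` stands in for pixels removed by cloud masking.

```python
from statistics import median


def chl_a_from_ndci(ndci_value, a=431.98, b=104.0, c=9.547):
    """Chl-a (µg/L) from NDCI using the Lake Malombe calibration (Eq. 5)."""
    return a * ndci_value ** 2 + b * ndci_value + c


def site_chl_a(buffer_ndci):
    """Chl-a for one sampling site from the NDCI pixels in its buffer zone."""
    valid = [v for v in buffer_ndci if v is not None]  # drop masked pixels
    if not valid:
        raise ValueError("no cloud-free pixels in the buffer zone")
    # The median limits the influence of a single irregular pixel.
    return chl_a_from_ndci(median(valid))


# Hypothetical buffer-zone NDCI values with one masked pixel.
print(round(site_chl_a([-0.05, None, 0.0, 0.05]), 3))
```

Because the relationship is quadratic, taking the median NDCI first and then converting is not the same as converting every pixel and taking the median chl-a; the sketch follows the order described in the text.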

The coefficient of determination R2 was calculated to evaluate the degree of association of the NDCI against a chl-a concentration. The residuals for individual validation were calculated by subtracting estimated chl-a from actual chl-a, and the most accurate validation was used to prepare the chl-a spatial distribution maps (Figs. 2 and 3). The accuracy of the NDCI model was further assessed by determining RMSE of the predicted chl-a concentration (Eq. 6). The lower the number, the more precise the model (Hussien et al. 2022).

$$\mathrm{RMSE}\left(\upmu \mathrm{g}/\mathrm{L}\right)=\sqrt{\frac{\sum {\left(\mathrm{Chl}\text{-}{\mathrm{a}}_{\mathrm{measured}}-\mathrm{Chl}\text{-}{\mathrm{a}}_{\mathrm{predicted}}\right)}^{2}}{n-1}}$$
(6)

where n is the total number of observations of the validation set. The established NDCI algorithm was applied to all pixels corresponding to the water area, delineated using a technique proposed by Donchyts et al. (2016). Python libraries, including geemap and cartopy, were used for image productions. Images available during a specific month were amalgamated, and mean chl-a values were calculated for all the water area pixels. Lighter green colors indicated a higher chl-a concentration, while darker green colors indicated a lower chl-a concentration. A color map bar ranges from 0 to 20 µg/L. We mapped chl-a concentration distributions for 8 months, starting from March to October 2019, based on the NDCI algorithms, which were applied to every pixel corresponding to the water area of Lake Malombe.
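The RMSE used to assess the validation set (Eq. 6) can be sketched in a few lines. The version below squares each residual before summing and takes the root of the mean, which is the conventional definition; the `ddof` argument is a hypothetical convenience for switching between dividing by n (the default) and by n − 1, the denominator written in Eq. 6.

```python
import math


def rmse(measured, predicted, ddof=0):
    """Root mean square error between measured and predicted chl-a (µg/L).

    ddof=0 divides the summed squared residuals by n; ddof=1 divides by n - 1.
    """
    n = len(measured)
    if n != len(predicted) or n - ddof <= 0:
        raise ValueError("need paired observations and n greater than ddof")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / (n - ddof))


# Hypothetical values: every prediction is off by exactly 1 µg/L.
print(rmse([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]))
```

For a validation set of 19 points the two denominators differ by only a few percent, so the choice has little effect on the reported value.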

Fig. 2

NDCI calibration and validation for Lake Malombe. Note: The established NDCI algorithm was applied to all pixels corresponding to the water area, which was delineated using a technique proposed by Donchyts et al. (2016). Python libraries including geemap and cartopy were used to produce the images. Images available during a specific month were composited, and mean chl-a values were calculated for all water area pixels. Lighter green colors indicate a higher chl-a concentration, while darker green colors indicate a lower chl-a concentration. The color bar ranges from 0 to 20 mg/m3

Fig. 3

Monthly Lake Malombe seasonal PP in 2019. Note: Chl-a time series of Lake Malombe covering the period from February 28, 2019, to October 31, 2019. The solid lines indicate satellite-derived data, while the circles indicate measured data

Statistical analysis

Statistical analyses (regression, correlation, and analysis of variance (ANOVA)) were performed using PAleontological STatistics version 4.11 and R statistical software version 4.2.1 for Windows, at significance levels of P < 0.01 or P < 0.05. A linear regression model was used to determine the factors influencing the trophic status of Lake Malombe. The goodness of fit of the regression model was evaluated using the coefficient of determination (R2). ANOVA was performed to analyze differences in water quality parameters among seasons and among sampling sections. Tukey’s post hoc multiple comparisons test was performed to identify the differences among the seasons and sampling sections.

Results

In situ data observation

The biogeochemical characteristics of lakes in Malawi are diverse, reflecting differences in geomorphological and climatic settings and in anthropogenic activities. Table 2 presents the ANOVA results for the water quality parameters. The chl-a of the water samples ranged from 3.51 to 12.25 µg/L over the observed period. The observed chl-a was far less than 28.4 µg/L (Tebbs et al. 2019), suggesting that Lake Malombe was not in a typical eutrophic state (Li et al. 2021). However, according to Nürnberg (1996), the lake could be classified as lightly eutrophic. Among the other physical–chemical parameters, dissolved oxygen (DO), sulfate (\({\mathrm{SO}}_{4}^{2-}\)), nitrite (\({\mathrm{NO}}_{2}^{-}\)), soluble reactive phosphorus (\({\mathrm{PO}}_{4}^{3-}\)), total dissolved solids (TDS), and chl-a differed significantly among the sampling seasons (P < 0.05), whereas temperature did not (P > 0.05). By contrast, the ANOVA results for water quality parameters across the sampling sections all had P > 0.05 (Table 3). The correlation matrix (Table 4) showed coefficients of 0.798 < r < 0.930 (n = 30, P < 0.005).

Table 2 Seasonal variations of Lake Malombe limnological parameters presented as mean ± SE
Table 3 Spatial variation of water physico-chemistry and chlorophyll a in Lake Malombe
Table 4 Pearson’s correlation coefficient matrix for trends in different sampling sites

Remote sensing output

The scatter plot (Fig. 2a) showed that almost all the points fell within the 95% confidence interval, suggesting that the coefficients proposed in this paper could be applied to NDCI images derived from any of these sensors to predict chl-a concentration in Lake Malombe accurately. The slight deviations from the 1:1 line could be attributed to uncertainties in the MSI atmospheric correction and to differences between the actual chl-a concentrations at the satellite overpass time and at the sampling time. We used ten measurements for calibration because they were obtained in the field within 1 day of a satellite overpass. We recognize that this number is relatively small, but we validated the established NDCI–chl-a relationship against 19 independent measurements. The validation RMSE was 2.88 µg/L, which is lower than that obtained in a previous study (Molkov et al., 2019), suggesting that the model’s utility was satisfactory. Color-coded chl-a distribution maps were prepared for each month and compared. The mean monthly chl-a estimates for the entire lake were computed using the NDCI algorithm and exploitable images (cloud cover below 20% of the lake) from March through October 2019. The regression model in Fig. 2b compares the in situ and satellite-derived chl-a concentrations. The NDCI calibration using the simulated data from various sensors showed a non-linear fit with measured chl-a values. This nonlinearity could be attributed to minor saturation observed in the model at low chl-a concentrations (Mishra et al. 2014). Despite this limitation, the wavelength range from 610 nm to 710 nm showed a moderate correlation (R2 = 0.54) and an RMSE (2.88 µg/L) comparable to that of previous studies that applied the same NDCI algorithm (Molkov et al. 2019; Li et al. 2021), indicating the capability of the NDCI to predict chl-a concentration in Lake Malombe. 
Sentinel-2 MSI images were used to compute an 8-month time series of chl-a concentrations at the five sampling sites in Lake Malombe (Fig. 4). The results indicate that the NDCI algorithm used to estimate chl-a concentrations performed well, since almost all values retrieved from remote sensing are in the same range as the in situ data. Figure 5 further shows the time series of satellite-derived mean chl-a concentrations for the five sampling locations during 2019 and the fitted curve, along with surface water temperature, land surface temperature, and daily precipitation data. The double-logistic function captured the annual pattern of the variable well. The derived end and start of the season were 98, 250, and 7.0, respectively. The end of the season almost coincided with the end of the rainy season, identified from the time series of precipitation data. The start of the season did not align with the start of the rainy season; i.e., phytoplankton started to grow before precipitation, suggesting that precipitation probably did not influence phytoplankton growth. Surface water temperatures ranged from 19 to 28℃; temperatures from May to August tended to be lower than in other months. Land surface temperatures, which started increasing at the beginning of August, peaked at 45℃ in November and then dropped to around 30℃. As shown in Fig. 5, the rise in chl-a concentrations observed from July to August coincided with the increase in surface water and land surface temperatures.

Fig. 4

Chl-a time series of Lake Malombe. Note: chl-a (open circles; top), the satellite-derived water and land surface temperature data (middle), and precipitation data (bottom), covering the entire year from January to December 2019. The central point of the lake (14.642°S, 35.257°E) and a location near the west shoreline (14.643°S, 35.163°E) were selected as monitoring points for MODIS imagery. Two dashed lines indicate the EOS and SOS, respectively

Fig. 5

The fitted double-logistic model (solid line) for chl-a time series data (black points) (upper) and the satellite-derived precipitation data (lower). Two dashed lines indicate the EOS and SOS, respectively

Discussion

Application of remote sensing algorithms in Lake Malombe

Chl-a is the best index of phytoplankton biomass for primary productivity studies (Iriarte et al., 2007). Phytoplankton move vertically along the water column, and chl-a concentration varies with time and space, resulting in a sporadic spatial distribution that can develop in only a few hours (Agha et al. 2012). These sporadic spatial distributions are difficult to estimate using traditional sampling techniques, and the remote sensing approach helps to offset this challenge. However, remote sensing depends on field-based data, especially for calibration purposes; hence the two approaches should be considered together. This paper comprehensively examines the chl-a variations in Lake Malombe from March to October 2019. The model validation results showed good accuracy, evidenced by the RMSE (2.88 µg/L), which was smaller than the value (3.02 µg/L) obtained by Molkov et al. (2019) in their validation dataset, using the same NDCI index retrieved from Sentinel-2 imagery. In our model calibration process, we noted that the R2 value (0.54) was smaller than the one obtained by Molkov et al. (2019). This could probably be attributed to the small number of measurements used for calibration. Nevertheless, the NDCI calibration results were in line with the NDCI calibration curve presented by Mishra and Mishra (2012), indicating that the NDCI in this study predicted chl-a concentration in Lake Malombe accurately. Additionally, chl-a concentrations retrieved from MSI images were consistent with in situ data, indicating that the NDCI algorithm could estimate chl-a concentrations in Lake Malombe with acceptable accuracy. It should be noted that satellite remote sensing has some limitations. For example, MODIS on NASA’s Terra and Aqua satellites has a relatively low spatial resolution (250 m to 1000 m), making it challenging to capture heterogeneous patterns in small lakes where water constituents are dispersed (Coffer et al. 2020).
However, the ESA launched the Sentinel-2A and 2B satellites in 2015 and 2017, respectively, which helps to offset these limitations. The MSI onboard those satellites has a temporal resolution of 5 days and a spatial resolution of 10 to 60 m depending on the spectral band, which makes it an ideal choice for monitoring smaller lakes like Lake Malombe (Claverie et al., 2018; Soomets et al. 2020). The MSI has 13 spectral bands spanning the visible, near-infrared, and short-wavelength infrared domains. In this paper, the retrieved chl-a concentration followed the temporal behavior of the field data, suggesting that the satellite remote sensing technique could depict chl-a spatial and temporal heterogeneity in Lake Malombe. The findings provide options to ecosystem managers for detecting and monitoring harmful algal blooms in the lake (Buma & Lee 2020).

Spatial and temporal dynamics of Lake Malombe trophic status

Sitoki et al. (2010) pointed out that the seasonal dynamics of the lake’s primary productivity, if not monitored, can severely affect aquatic biota and humans. For example, large-scale algal blooms can result in water quality deterioration, leading to a collapse of the lake fishery. Gophen et al. (1995) pointed out that the effects of eutrophication—which include changes in phytoplankton composition, shifting from diatoms to cyanobacteria along with an increase in algal biomass—may result in fish kills in shallow water bodies and more severe deoxygenation in deeper waters such as Lake Tanganyika and Lake Malawi (Marshall et al. 2009). Several researchers have also reached a consensus that eutrophication in shallow water bodies represents a significant ecological problem and could detrimentally affect the ecosystem functioning and fish population dynamics and species composition, leading to a trophic cascade (Smith 2003; Sondergaard and Jeppesen 2007; Leira et al. 2008; Kolding et al. 2008; Makwinja et al. 2021a). In the USA, eutrophication accounts for 60% of lake impairment and water quality-related problems with severe economic implications (US EPA 1996). In Asia–Pacific, harmful algal blooms caused massive fish kills in 1994, 3164 human poisoning incidents, and 148 deaths (Carpenter et al. 1999). The monitoring efforts also cost the USA about US$ 1 million per event, with US$ 50,000 per affected area. In Lake Malombe, Njaya et al. (2011) and Makwinja et al. (2021d) evidenced the collapse of Oreochromis spp. in the 1990s, Bagrus meridionalis and Clarias spp. in 2000, large demersal haplochromine cichlids in 1986, Labeo mesops in 1996, Copadichromis virginalis in 1994, and overall, the whole lake fishery in 2016, which among other factors was attributed to profound ecological changes of the lake (Makwinja et al. 2021e).

In this study, the successive images showed the distribution of chl-a concentrations over the lake (Fig. 3), an indicator of potential eutrophication. The chl-a maps did not show marked spatial variability, suggesting that chl-a concentration in Lake Malombe is spatially homogeneous. These observations were confirmed by the correlation matrix generated from in situ data, which also showed that a large area of Lake Malombe had similar chl-a concentration levels. Other studies made similar observations in shallow freshwater lakes and mostly reported seasonal dynamics of chl-a. For example, Søndergaard et al. (2017) showed that the temporal dynamics of P loading and chl-a concentrations in shallow lakes are more pronounced than the spatial dynamics. Tšertova et al. (2011) and Li et al. (2022) also reported similar scenarios in Lake Võrtsjärv and Lake Taihu, China. In contrast to its spatial homogeneity, chl-a concentration in Lake Malombe shows a strong seasonal pattern. The in situ data analysis and remote sensing results confirmed that seasonal variations influenced chl-a distribution and abundance in Lake Malombe, a pattern also reported by many studies in tropical and temperate lakes (Grover and Chrzanowski 2006; Abell et al. 2012; Cunha et al. 2013; Degefu and Schager 2015). High chl-a concentrations were recorded in March (rainy season) and October (hot-dry season). The high NDCI in October and March corresponded to the peak chl-a. The chl-a concentration observed in October suggested that this month is linked to high primary productivity (Makwinja et al., 2021e). Macuiane et al. (2011) made a similar observation and reported that chl-a concentration in Lake Chilwa, Malawi, is consistently above 15 μg/L, with the peak in October and November, which coincides with the hot, dry season. Hecky and Kling (1981) concluded that October, considered the hottest month of the year in Bujumbura, is strongly linked to the Anabaena and Strombidium biomass peak in Lake Tanganyika.
The chl-a concentrations were the lowest from May to August, suggesting the low productivity of the lake during these months. Zhang et al. (2010) reported the lowest chl-a concentration (mean 14.7 ± 7.1 µg/L) in winter and the highest (mean 59.3 ± 94.7 µg/L) in spring in Lake Taihu, China.

Drivers of Lake Malombe temporal chl-a dynamics

Table 3 demonstrates that chl-a concentration and other limnological parameters such as temperature, DO, pH, SRP, SO42−, and NO2 were significantly higher during the hot, dry season (September to November) and the rainy season (January to April) than during the cold season (May to August). This variation could be linked to local climatic characteristics. For example, Fig. 6 shows a strong positive correlation between temperature and chl-a, with an adjusted R2 of 0.5, suggesting that as the lake warms up, the chl-a concentration increases. Silsbe et al. (2006) pointed out that changes in the mean water-column temperature, which are more pronounced during the hot, dry season than in the other seasons, are linked to changes in chl-a concentration in shallow lakes. Saberioon et al. (2020) cited temperature as the main driver of chl-a concentration in shallow lakes, followed by precipitation. Abell et al. (2012) also suggested that rapid growth of phytoplankton and bacterioplankton during summer increases nutrient demand and hence produces seasonal depletion, which may result in competition for nutrients among the phytoplankton taxa. This competition may lead to the dominance of taxa with a high affinity for N or P, such as Cyclotella spp., Synedra acus, Nitzschia spp., and Chroococcus dispersus (Grover et al. 1999). The typical dominance of cyanobacteria during summer has been reported in many tropical lakes over the past decades (Grover et al. 1999; Albay & Akçaalan 2003; Grover & Chrzanowski 2006; Tian et al. 2013). This study showed that the mean monthly water temperatures in September and October are markedly higher, which could probably lead to differences in the stability and depth of the mixed layer, a key variable determining nutrient availability and primary production. This observation concurs with Willis et al. (2019), who suggested that high solar irradiation and temperatures favor the growth of cyanobacteria, which are highly harmful to humans.
Paerl and Huisman (2008) also suggested that as temperatures exceed 20 °C, the growth rate of eukaryotic phytoplankton generally declines while cyanobacteria growth rates take over, giving cyanobacteria a competitive advantage due to physiological and physical factors such as faster growth and improved stratification (Peperzak 2003). Powers et al. (2020) claimed that warming amplifies the trophic status of shallow lakes, though some lakes may diverge from this pattern. Lake surface warming extends the duration of thermal stratification, preventing phytoplankton from sinking to the aphotic zone, where they are light-limited. Warming also extends the growing season by alleviating light limitation, thereby increasing phytoplankton biomass. The positive regression coefficient between chl-a and lake surface temperature on interannual timescales shows such a relationship (Powers et al. 2020). The high primary production in October (hot-dry season) suggests that the warming of Lake Malombe strongly favors phytoplankton such as cyanobacteria, which are less efficiently consumed by grazers. A similar scenario was reported for Lake Victoria (Rigosi et al. 2014). Tian et al. (2013) also reported the peak chl-a concentration in August in Dongping Lake, China, and attributed it to high average daily temperatures. Heisler et al. (2008) cautioned that high exposure of light eutrophic water bodies to elevated temperatures and other external factors could potentially lead to an outbreak of water bloom. Tian et al. (2013) echoed Heisler et al. (2008) by suggesting that water temperature is an important environmental factor influencing phytoplankton growth in most eutrophic water bodies.
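The adjusted R2 quoted for the temperature and chl-a relationship can be reproduced for any paired dataset with an ordinary least-squares fit. The sketch below is a minimal illustration, assuming a simple linear model with one predictor; the function names and the temperature and chl-a values are hypothetical and are not the data behind Fig. 6.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def adjusted_r2(x, y, n_predictors=1):
    """Adjusted coefficient of determination of the linear fit of y on x."""
    slope, intercept = linear_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    n = len(y)
    # Penalise R2 for the number of predictors relative to the sample size.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Hypothetical monthly means: water temperature (deg C) and chl-a (µg/L).
temperature = [27.5, 25.0, 21.0, 19.5, 20.0, 22.5, 26.0, 28.0]
chla = [17.0, 15.5, 12.0, 11.0, 12.5, 12.0, 15.0, 18.0]
slope, intercept = linear_fit(temperature, chla)
print(f"slope = {slope:.2f} µg/L per deg C, adjusted R2 = {adjusted_r2(temperature, chla):.2f}")
```

Because the adjustment depends on sample size, the adjusted R2 falls noticeably below the plain R2 when only a few monthly means are available, which is worth keeping in mind when interpreting the reported value.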

Fig. 6

Relationship between temperature, SRP, nitrite, TDS, and chl-a

TDS could be another factor, although it did not show a strong relationship with chl-a (Fig. 6). Aranha et al. (2022) also observed that the chl-a peak in shallow lakes is recorded during periods of low wind speed, which in Malawi fall in March, April, September, and October. Feng et al. (2015) and Zhang et al. (2021) also suggested that horizontal and spatial water movements alongside low light penetration could inhibit phytoplankton growth during the cold, windy season compared to other seasons. A further explanation is the excessive nutrient input released from the lake catchment, particularly in the rainy season, a scenario also reported for Lake Tana, Ethiopia (Setegn et al. 2011), Lake Chivero, Zimbabwe (Robarts and Southall 1977), Lake Victoria (Hecky and Bugenyi 1996), and Funil Reservoir, Brazil (Araújo et al., 2011). Aranha et al. (2022) also claimed that a decrease in water volume during the hot-dry season could increase chl-a concentration at the water surface, a scenario observed in reservoirs in semi-arid regions. Aranha et al. further explained that low water levels and a long hydraulic residence time are linked to high nutrient availability for primary production during the hot-dry season. Mugidde et al. (2003) pointed out that nutrient availability in shallow water bodies such as Lake Victoria strongly influenced primary production, affecting phytoplankton abundance and species composition. Sedwick et al. (2002) also pointed out that phytoplankton growth is strongly linked to inputs of nutrients such as N and P, as well as micro-quantity elements such as Fe and Mg. Tian et al. (2013) also noted that heavy rainfalls in October 2010 caused increased nutrient concentration in Dongping Lake, China, which provided sufficient nutrients for algae growth. Figure 6 further shows that nitrite and SRP were significantly correlated with chl-a. Several studies have also demonstrated similar connections in many water bodies across the globe.
For example, in Lake Victoria, the dominance of blue-green algae, a precursor of cyanobacterial toxicity and algal blooms, was linked to increased nutrient loading into the lake during the rainy season (Bootsma and Hecky 1993). Macuiane et al. (2011) made a similar observation in the Lake Chilwa basin, Malawi, and attributed it to nutrient inputs during the rainy season. In Lake Naivasha, Kenya, increased chl-a concentration in the lake was significantly linked to increased soluble reactive phosphorus triggered by the heavy rainy season between October and November 1997. Villalobos et al. (2003) also demonstrated a strong relationship (r = 0.89, P < 0.05) between SRP concentration and chl-a in different North Patagonian lakes. Filstrup and Downing (2017) pointed out that chl-a concentration increases with an increase in nitrite until it reaches a threshold of 3 mg/L and decreases thereafter, resulting in a high nutrient concentration in the lake.

Conclusion

In this study, the trophic state of the inland water body was assessed using a remote sensing algorithm and field surveys. The result indicates that integrating satellite remote sensing and field data can provide opportunities to monitor the inland water bodies trophic status with high spatial and temporal variability. This information helps understand the shallow lakes’ trophic status dynamics and provides a basis for developing quantitative and cost-effective water bodies’ monitoring programs. The results demonstrate that remote sensing can be a sensitive tool for monitoring the lake’s trophic status temporal and spatial variations. The lake trophic status forms an essential linkage with fisheries production; hence this study provides valuable information for fisheries management concerning spatial and temporal changes in the lake. Considering the high spatial resolution of MSI-images, we suggest that remote sensing can be used to identify high spatial variability in inland freshwater shallow lakes, which could not be captured by field monitoring. Thus, satellite data with appropriate calibration and validation is essential to comprehensively understand the trend and pattern of events such as algal blooms in inland water bodies such as Lake Malombe. This study suggests that monitoring Lake Malombe trophic status using integrated remote sensing satellite images and field-based data provides a reliable monitoring tool for the policy and decision-makers for sustainable management purposes.