
1 Introduction

Agricultural production is under increasing pressure from anthropogenically induced and natural changes, including a rising population, conversion of food crops (cereals) into biofuels, increased protein demand and climatic extremes [1]. Through a fleet of Earth Observation (EO) satellites, national and international space agencies are determined to keep their fingers on the pulse of agricultural land and crop growth [2]. Among the objectives of the multiple EO satellite missions launched in the last five decades, primary importance has been given to observing agricultural and natural vegetation land covers [3,4,5,6,7]. The strong correlation between the response of vegetation in the visible and near-infrared spectrum and its biophysical activity has led to a preference for optical sensors in crop growth monitoring [8].

Optical data from EO image time series at high temporal resolution can effectively assist in vegetation monitoring over time, as they provide key information about vegetation status over large areas. However, imagery acquired at a high temporal resolution traditionally comes at the expense of a low spatial resolution, and EO missions dedicated to time series studies have long been restricted to wide-swath sensors that achieve global coverage on a near-daily basis. For instance, the Advanced Very High-Resolution Radiometer (AVHRR) pioneered time series studies for vegetation monitoring at regional to global scales for more than 25 years. AVHRR has been collecting near-daily global coverage at coarse-to-moderate spatial resolution (1 km and 8 km), providing a consistent time series of temporally composited observations [9,10,11]. As a marked improvement, the Moderate Resolution Imaging Spectroradiometer (MODIS) has provided, since the early 2000s, an improved time series of multispectral observations, acquiring global coverage with a high temporal (daily) resolution, a higher spatial resolution (250–500 m) and seven land-related spectral bands for vegetation detection. MODIS data have become increasingly used for vegetation growth monitoring over large geographic regions [12, 13]. Yet, probably the most noteworthy pioneering mission for land applications is the Landsat series of satellite-based sensors. Landsat has long been appropriate for many landscape characterization applications such as land cover classification, change detection and vegetation monitoring. It has a nominal 16-day temporal resolution and up to 30 m spatial resolution, with a data archive extending from the early 1970s to the present. However, the use of Landsat time series for crop growth monitoring has limitations because vegetation changes may occur more rapidly than the 16-day revisit time of Landsat. In addition, cloud cover contamination of the optical satellite observations further reduces the number of Landsat images available to adequately detect many seasonal events [8].

These pioneering monitoring missions paved the way for a diversity of dedicated EO land missions initiated by national and international space agencies, with emphasis on exploiting the spatial, spectral, or temporal domain. With current and upcoming EO satellite missions, an ever-increasing number of optical EO satellites are orbiting the Earth, such as the Sentinel constellations on behalf of the joint ESA/European Commission initiative Copernicus and the NASA A-Train satellite constellations. With the operational super-spectral Copernicus’ Sentinel-2 (S2) [14] and Sentinel-3 missions [15], as well as the recently launched and upcoming imaging spectrometer missions [16,17,18,19], an unprecedented data stream for vegetation mapping and monitoring becomes available. For instance, the high frequency of S2 multispectral observations (every five days) with a spatial resolution of 20 m (up to 10 m for specific bands) captures rapid changes of agricultural land cover from national to field scale, serving as a major support for environmental monitoring and agricultural subsidy control [14]. Hence, S2 time series allow for high-resolution coverage of large areas with systematic data acquisition and high-frequency sampling during critical phases of the crop growth cycle [20]. The Sentinel-3 satellites even enable a short revisit time of less than two days for the optical sensor OLCI (Ocean and Land Colour Instrument), but OLCI is a medium-resolution imaging spectrometer with a spatial resolution of 300 m [15], and thus is less suited for crop monitoring at field scale.

With an unprecedented influx of optical time series data at our disposal, an essential condition for using image data in further processing is that they be spatially and temporally continuous, i.e., gap-free. Unfortunately, in reality this need is often unfulfilled, due to multiple causes: (1) unfavorable climatic conditions (clouds, snow, dust and aerosols), (2) instrumentation errors, (3) losses of data during data transmission or (4) low temporal resolution (i.e., a long time needed to revisit and acquire data for the exact same location), among others. These causes degrade the availability of the spatial and temporal information required to retrieve land surface properties. The impact of missing data on quantitative research can therefore be serious, leading to biased estimates of parameters, loss of information, decreased statistical power, increased standard errors, and weakened findings [21]. For this reason, spatiotemporal reconstruction of gapped areas in satellite imagery is becoming crucial for monitoring purposes [22], including the knowledge of the life cycle of vegetation, i.e., vegetation phenology [23].

Another important remark is that, from an EO perspective, specific plant seasonal events such as budbreak, leaf out, leaf senescence, flowering and maturity of cereal crops cannot be directly detected at the spatial resolution of satellite imagery. Instead, more general descriptors of vegetation dynamics termed ‘land surface phenology (LSP)’ are calculated [8]. LSP refers to the seasonal pattern of variation in vegetated land surfaces observed from remote sensing [24]. This is distinct from observations of individual plants or species, as space-based observations aggregate information on the timing of heterogeneous vegetation development over pixel-sized areas. This aggregation often dissociates the response signal of the landscape from that of the individual species; yet it is important for representing landscape-scale processes in biosphere-atmosphere interaction and crop monitoring models [24]. LSP metrics are typically associated with general inter-annual vegetation changes interpretable from spectral remote sensing imagery, such as the start of greening/season (SOS), the peak of the growing season, the onset of senescence or end of the season (EOS), and growing season length [24, 25], as well as other transition stages (e.g., maturity and senescence) [5]. Therefore, this chapter provides an overview of the possibilities for calculating these LSP metrics from time series images for crop monitoring purposes.

Altogether, when aiming to process time series data for calculation of LSP metrics and agricultural monitoring purposes, a critical aspect to deal with is that EO data are spatially and temporally discontinuous. This implies that the ability to process irregular time series becomes indispensable for studying seasonal vegetation patterns. In this respect, this chapter aims to provide a general overview of agricultural land monitoring by means of EO image time series analysis and subsequent LSP calculation. To do so, first a historical overview is given of EO satellites with optical sensors that are designed to monitor the phenology of agricultural lands. Second, solutions are offered on how to gap-fill time series image data and then to calculate LSP metrics. Third, the calculation of LSP metrics from MODIS and 1 km aggregated S2 data is presented for two demonstration areas, each characterized by a different dominant crop: corn and winter wheat. Finally, trends in EO missions and image time series processing are discussed in the broader context of monitoring cropland phenology.

2 Satellite Sensors for Crop Phenology Monitoring

Although in the current era of EO missions time series processing has become standard practice in agriculture monitoring, it only recently reached maturity. Almost half a century was dedicated to overcoming challenges related to EO technology and optimizing for ideal temporal and spatial resolution. In this respect, this section intends to give a brief historical overview about EO satellite missions for agricultural monitoring purposes. Afterwards, time series data from NASA and ESA flagship missions for land applications are used for presenting crop monitoring demonstration cases.

When EO satellites first became available in the 60s, it was recognized that the technology held considerable promise for agricultural monitoring [26]. NASA pioneered EO programs for agricultural monitoring purposes. Initial efforts involved the NASA LACIE and AgriSTARS programs in the 70s. They made significant advances in crop monitoring but were seriously constrained by satellite data availability. At the beginning of EO missions, satellite optical data were primarily provided globally at coarse resolution (c. 250 m–8 km) by systems specifically developed for land applications. This is especially true for the AVHRR sensors, launched back in the 80s. AVHRRs provided daily global observations, which represent one of the most critical features needed for agriculture monitoring, but they were limited by their low spatial resolution (1 km). It has long been recognized that agriculture applications require a good temporal resolution, given that crop phenology and conditions (e.g., water supply, pests, environmental) can change very quickly. To this end, the NASA Long Term Data Record (LTDR) contains gridded daily surface reflectance and brightness temperatures derived from processing of the data acquired by the AVHRR sensors onboard four NOAA polar-orbiting satellites: NOAA-7, -9, -11 and -14. Version 4 contains improvements to geolocation, cloud masking and calibration, making the data record suitable for crop monitoring [27]. This product is still operational, and its usefulness has been demonstrated for a wide variety of applications such as snow cover estimation [28], agricultural modeling [27], Leaf Area Index (LAI) and Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) retrieval [29, 30], global vegetation monitoring [31, 32], burned area mapping [33] and albedo estimation products [34].

A next milestone involved the MODIS sensor on the Terra satellite. Since its launch in 2000, observations from the EOS/MODIS sensors have offered several of the key qualities needed for global agriculture monitoring, such as global, daily coverage at coarse spatial resolution (250 m) and a suite of validated products. With MODIS onboard Terra (morning satellite) and Aqua (afternoon satellite) approaching the end of their operational life, it was high time to transition to new satellites. The Visible Infrared Imaging Radiometer Suite (VIIRS) onboard the Suomi National Polar-orbiting Partnership (S-NPP) satellite has provided continuity with MODIS since 2012 [35, 36]. It overpasses once a day, during the afternoon, which decreases the chance of getting cloud-free observations, especially in tropical regions. However, the combination with the ESA Sentinel-3 satellite, which overpasses during the morning and has similar characteristics to MODIS, provides continuity to the successful and still valuable (due to their high temporal resolution) coarse resolution missions. Despite the advantage of the short revisit time, the main disadvantage of coarse to moderate resolution sensors is the spatial resolution, which often mixes, in a given pixel, signals from different land cover types and crops. Stratifying a region into different crop types (commonly termed crop masking) is an important step in developing EO-based agriculture models [37]. Such masks enable the isolation of the remotely sensed, crop-specific signal throughout the growing season, reducing the noise on the signal from other land cover or crop types [38]. In the United States (US), the US Department of Agriculture (USDA) has generated a yearly national Cropland Data Layer (CDL) since 2007 [39], and Canada has provided yearly national Annual Crop Inventory Maps (ACIM) since 2009 [40]. These masks are provided at the end of the growing season and no crop type masks are available for other countries. Therefore, EO-based agriculture models generally use static cropland or crop type masks.

The flagship moderate resolution NASA mission Landsat, with data going back to the 70s, was long used for agriculture monitoring, but with limitations mainly due to its low temporal resolution of 16 days. Coupled with the frequency of cloud cover, the effective revisit time for some regions is often worse. A number of studies have fused Landsat with MODIS data [41,42,43], and combined Landsat data with biophysical models [44, 45], leading to varying results in terms of errors, as they are still constrained by the low temporal frequency of Landsat imagery. The launch of the ESA optical moderate resolution missions Sentinel-2A in 2015 and Sentinel-2B in 2017 has been revolutionary for moderate resolution agriculture monitoring. The increased temporal coverage and the new technologies offered by the Sentinel systems, and their combination with NASA sensors, provide new opportunities for high temporal frequency moderate resolution remote sensing, enabling a new generation of agriculture products to be generated. Specifically, with the fusion of Sentinel-2A and -2B with Landsat, it is now possible to achieve a temporal resolution of three to five days globally. In fact, recent studies leverage the combination of these satellites to address crop yield assessment at field scale [46,47,48]. Yet, simply having synergistic sensors in orbit is not sufficient for end users; the data products themselves must also be processed in such a way as to ease the preprocessing and analysis burden. The Harmonized Landsat/Sentinel-2 (HLS) project [49] developed by NASA provides a surface reflectance product that combines observations from USGS/NASA’s Landsat-8 (LS8) and ESA’s Sentinel-2 (S2) satellites at moderate spatial resolution (30 m). The main goal is to provide a unique dataset based on both satellites’ data to improve the revisit time to three to five days, depending on the latitude. Along with a common atmospheric correction algorithm [50], geometric resampling to 30 m spatial resolution and geographic registration [49], the product is also corrected for Bidirectional Reflectance Distribution Function (BRDF) effects and band pass adjustment. In addition, the Sen2like tool [51] developed by ESA will provide analysis-ready harmonized LS8 and S2 data/products to the user. Using the S2 tiling system, the Sen2like tool processes S2 Level-1 products and LS8 Level-1 products and creates a harmonized surface reflectance data stack at 10 m spatial resolution. Working on the same baseline principles as the NASA HLS initiative, geometric, radiometric and image processing algorithms are applied. Recent studies took advantage of Landsat and S2 data to address crop yield assessment at a moderate spatial resolution [52, 53].

Recent advances in data acquisition and processing (e.g., cloud computing) are making possible the development of global high-to-moderate resolution data sets (10–30 m). Such global time series data will permit improved mapping of crop type, crop area and vegetation properties essential for regional implementation of monitoring strategies. Higher temporal frequency from multiple high-to-moderate resolution satellites will also provide a better characterization of agronomic growth stages, with the consequent improvement of crop production modeling accuracy.

3 Time Series Processing for Crop Seasonality Monitoring

3.1 Gap-Filling

An essential step for being able to use EO data for further processing such as LSP calculation, is converting raw data time series into spatio-temporal continuous datasets. To ensure this, gaps mostly provoked by clouds must be filled. Time series gap filling essentially refers to the prediction of missing values in time. Mostly, these missing values are located within the dataset time series, so in principle interpolation methods to fill them up would suffice. It is therefore no surprise that interpolations and fitting methods are commonly used as a first step in the time series processing. According to the recent review by [8], gap filling methods can be categorized into: (1) smoothing and empirical methods, (2) data transformations, and (3) curve fitting methods. From these three categories, the curve fitting methods are the most commonly used, with double logistic curves being a popular method for seasonality estimation [54,55,56]. This family of methods has expanded rapidly in the last few years with the emergence of adaptive machine learning regression algorithms [57]. See also [57, 58] for a quantitative evaluation of these methods. Some machine learning methods proved to be particularly attractive; not only because of achieving higher accuracies when validated against a reference image, but also because of additional properties such as delivering uncertainty estimates (e.g. Gaussian processes regression: GPR). Most of these methods have been recently implemented into an in-house developed graphical user interface (GUI) toolbox, named DATimeS (Decomposition and Analysis of Time Series software) [57]. DATimeS has been developed to generate cloud-free composite maps from regular or irregular satellite time series. The novelty of the toolbox lies in expanding established time series gap-filling methods with a diversity of advanced machine learning fitting algorithms. An overview of the gap-filling methods is provided in Table 1.

Table 1 Interpolation methods used for gap-filling

Here, a brief description of the toolbox is provided, as it will be used in the subsequent calculation of phenology indicators. In short, DATimeS is developed as a modular toolbox that can be applied both to sets of images and to discrete time series data stored in a text file. An overview of the DATimeS modules contained in this first version (v.1.06) is shown in Fig. 1. The core machinery of DATimeS is the “Time Series Analysis” module, where the gap-filling methods can be selected, and subsequent phenology indicators can be computed. The user may choose whether to apply a smoothing function prior to the parameter estimation. Although a prior smoothing step may help in finding general patterns, it must be remarked that most fitting methods smooth the time series anyhow as part of the fitting prediction. Before starting the gap-filling procedure, a compulsory step is to define the output time settings, i.e., the days for which cloud-free interpolated images will be generated (e.g., every 10 days). Such composite products from operational land missions (e.g., AVHRR, MODIS, SPOT Vegetation (VGT)) are commonly used for subsequent LSP calculation [25, 59,60,61].
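As an illustration of what gap-filling onto user-defined output days involves, a minimal sketch is given below. It is not DATimeS code: plain linear interpolation is used only as the simplest stand-in for the fitting methods discussed above, and the function name `gap_fill_linear`, the use of `None` for cloudy observations and the sample values are assumptions made for this example.

```python
from bisect import bisect_left


def gap_fill_linear(days, values, out_days):
    """Interpolate valid observations onto the requested output days.

    days     : acquisition days, strictly increasing (assumption)
    values   : one value per day, None where the pixel was cloudy
    out_days : days for which a gap-free value is wanted
    Output days outside the observed range take the nearest valid value.
    """
    valid = [(d, v) for d, v in zip(days, values) if v is not None]
    if not valid:
        raise ValueError("no valid observations in the time series")
    xs = [d for d, _ in valid]
    ys = [v for _, v in valid]

    filled = []
    for t in out_days:
        j = bisect_left(xs, t)  # index of first valid day >= t
        if j == 0:
            filled.append(ys[0])
        elif j == len(xs):
            filled.append(ys[-1])
        else:
            # weight of the right-hand neighbour along the segment
            w = (t - xs[j - 1]) / (xs[j] - xs[j - 1])
            filled.append(ys[j - 1] + w * (ys[j] - ys[j - 1]))
    return filled


if __name__ == "__main__":
    # Hypothetical 5-day series with three cloudy acquisitions
    days = [1, 6, 11, 16, 21, 26, 31]
    index = [0.10, None, 0.20, None, None, 0.50, 0.40]
    print(gap_fill_linear(days, index, out_days=[1, 11, 21, 31]))
```

A curve-fitting or machine learning method would replace only the interpolation step; the separation between irregular input days and regular output days stays the same.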

Fig. 1 Hierarchical design of DATimeS

3.2 LSP Calculation

A next step involves the calculation of the phenology indicators from the prepared cloud-free time series data, i.e., the LSP metrics. Numerous studies have dealt with the retrieval of phenological phases from remotely sensed data [55, 62,63,64,65]. The quantification of LSP metrics over croplands is widely used for yield estimation, or to improve the management and timing of field work (planting, fertilizing, irrigating, crop protection or harvesting) [66, 67]. Distinct LSP metrics may be of interest to the scientific community, private companies and farmers, such as the dates of the start and end of the growing season (SOS and EOS, respectively), the maximum peak, the seasonal amplitude defined between the base level and the maximum value for each individual season, the length of the season, etc. [38, 68]. These LSP metrics are extremely sensitive to changes in vegetation cycles related to multiple factors such as climate anomalies or extreme weather events, which can have a profound impact on agricultural production [69,70,71]. Hence, estimating LSP metrics is a convenient way to summarize seasonal information in a few comprehensive quantitative descriptors. However, it must be taken into account that these metrics are sensitive to the characteristics of the data and to the processing methods used (e.g., gap-filling method, pixel size, time period of the time series). Therefore, outputs must be carefully analyzed (see also the review in [8]), as will be further demonstrated in the case study.

In practice, LSP metrics are recommended to be derived after the interpolation step so that cloud-free composite images are created, and trends become evident for easy phenological metrics derivation. For this reason, DATimeS recommends LSP estimation as the next logical processing module after the gap-filling module, even if going directly to this step is also possible. In this module, the whole time series is first analyzed looking for possible multiple growing seasons.

Then, each individual season is processed separately to estimate the phenological indicators (e.g., SOS and EOS) based on conventional threshold methods, analogous to [68, 72,73,74,75]. The computational routine for LSP calculation follows multiple steps. It runs pixelwise, and for each pixel it: (1) extracts the time series, (2) automatically identifies individual growing seasons within each year, (3) locates specific points within the growing season (e.g., SOS, EOS, peak), (4) computes seasonal integrals (area under the curve between SOS and EOS) and (5) stores the estimates in output ENVI or TIFF files. Three alternative methods have been implemented to calculate the SOS/EOS: (1) seasonal, (2) relative and (3) absolute amplitude. In the first method, the SOS/EOS are identified where the left/right part of the curve reaches a fraction of the seasonal maximum amplitude along the rising/decaying part of the curve. The second method is similar, but now a mean amplitude is estimated considering the minimum/maximum values of all seasons. Consequently, the SOS/EOS correspond to the dates where the curve reaches a specific percentage of this reference amplitude. In the third method, the SOS/EOS are determined as the dates when each growing season reaches the same fixed value.
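The three threshold definitions can be sketched for a single gap-filled season as follows. This is an illustrative sketch, not the DATimeS routine: the function names, the use of the season minimum as base level, the linear interpolation of the crossing date and the sample values are all assumptions made for this example.

```python
def seasonal_threshold(season, frac):
    """Threshold at a fraction of this season's own amplitude."""
    base, top = min(season), max(season)  # assumption: base level = season minimum
    return base + frac * (top - base)


def relative_threshold(seasons, frac):
    """Threshold at a fraction of the mean amplitude over all seasons."""
    mean_min = sum(min(s) for s in seasons) / len(seasons)
    mean_max = sum(max(s) for s in seasons) / len(seasons)
    return mean_min + frac * (mean_max - mean_min)


def crossing_day(d0, v0, d1, v1, thr):
    """Day at which the straight segment (d0, v0)-(d1, v1) equals thr."""
    if v1 == v0:
        return d0
    return d0 + (thr - v0) * (d1 - d0) / (v1 - v0)


def sos_eos(days, values, thr):
    """Return (SOS, EOS) for one season; None where thr is never crossed."""
    peak = max(range(len(values)), key=values.__getitem__)
    sos = eos = None
    # Rising limb: first upward crossing before the peak
    for i in range(peak):
        if values[i] < thr <= values[i + 1]:
            sos = crossing_day(days[i], values[i], days[i + 1], values[i + 1], thr)
            break
    # Decaying limb: last downward crossing after the peak
    for i in range(len(values) - 1, peak, -1):
        if values[i] < thr <= values[i - 1]:
            eos = crossing_day(days[i - 1], values[i - 1], days[i], values[i], thr)
            break
    return sos, eos


if __name__ == "__main__":
    days = [0, 10, 20, 30, 40, 50, 60]
    season = [0.1, 0.2, 0.4, 0.6, 0.4, 0.2, 0.1]  # hypothetical index values
    print(sos_eos(days, season, seasonal_threshold(season, 0.5)))  # seasonal amplitude
    print(sos_eos(days, season, relative_threshold([season], 0.5)))  # relative amplitude
    print(sos_eos(days, season, 0.3))  # absolute amplitude: fixed value
```

The three methods differ only in how the threshold is obtained; the search for the crossing dates along the rising and decaying limbs is shared.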

4 Demonstration Cases Time Series Processing

Having outlined the main principles of (1) EO missions dedicated to crop monitoring, (2) gap-filling methods, and (3) LSP calculation, this section provides some time series demonstration cases with temporal data coming from the currently most successful optical missions at low and high spatial resolution, i.e. MODIS and S2 acquisitions. The study focuses on the trade-off between revisit time and spatial resolution of each sensor and is carried out over two agricultural landscapes of the US, each one characterized by the presence of a different dominant crop type with specific phenological dynamics: winter wheat and corn.

4.1 Study Area and Data Acquisition

The US is one of the main producers and exporters of corn and wheat globally. In 2016 the US was the leading wheat exporting country, shipping 14.8% of global wheat exports (Footnote 1). Wheat is produced in almost every state in the United States and winter wheat varieties dominate US production, representing between 70% and 80% of the total wheat production. The winter wheat is planted in the fall and harvested during June-July. Generally, wheat is rain-fed and just 7% of the national production is irrigated. The main wheat class is Hard Red Winter Wheat, which is grown primarily in the Great Plains, with Kansas being the largest producing state. Besides, the US is a major player in the world corn trade market, with between 10% and 20% of its corn crop exported to other countries. Corn is grown in most U.S. States, but production is concentrated in the Heartland region (including Illinois, Iowa, Indiana, eastern portions of South Dakota and Nebraska, western Kentucky and Ohio, and the northern two-thirds of Missouri). Iowa and Illinois, the top corn-producing States, typically account for about one-third of the U.S. crop. The corn is planted during April-June and is harvested during September-November.

With the aim of performing a fair comparison of the multispectral spatiotemporal information carried by high and low spatial resolution multispectral imagery, i.e. S2 and MODIS, over corn and winter wheat, the Crop Data Layer (CDL) produced yearly by the National Agricultural Statistics Service (NASS) of the US Department of Agriculture (USDA) was analyzed, and two S2 tiles, each representative of one crop type, were selected.

4.1.1 Crop Data Layer

The Crop Data Layer (CDL) has been distributed by NASS since 2008 at 30 m as part of the official archive of county-level statistics on yield, area harvested and production that are available from the USDA National Agricultural Statistics Service (NASS) Quick Stats database (Footnote 2). It is a rasterized land cover map built from field-level training data from extensive ground surveys, farmer reports provided to the US Farm Service Agency (FSA), and remotely sensed data from the Landsat Thematic Mapper (TM), the Landsat Enhanced Thematic Mapper (ETM+) and the Advanced Wide Field Sensor (AWiFS). These data are used in a decision tree classifier to produce a land cover classification that distinguishes between different crop types, including winter wheat [39, 76].

4.1.2 MODIS and Sentinel-2 Surface Reflectance Time-Series

The time span chosen for the study was the year 2019, due to the availability in Google Earth Engine (GEE) [77] of S2 surface reflectance images over the US from December 2018 onwards. The first tile chosen is 11TLM, which is located in the northwestern US, in the south of Washington, and contains mainly winter wheat cultivated areas. The second one is 15TVH, centered in northern Iowa, with essentially corn and soybean crops. S2 data were downloaded from GEE in UTM projected coordinates. Given the limit imposed by the spatial resolution of the CDL, S2 data distributed at 20 m were gathered. Aside from the spatial density of the crop of interest, a second criterion for the selection of the two areas was their medium level of cloudiness, estimated by analyzing the MODIS daily cloud mask. This way, the main advantages and drawbacks of the shorter revisit time of coarse resolution MODIS imagery against the longer revisit time of high resolution S2 acquisitions can be assessed. Details about the spatial properties of the test sites are summarized in Fig. 2. The nine classes in the legend correspond to the most frequent classes within the two test sites, among the 134 provided by USDA [78]. Tile 15TVH is essentially made up of two main classes, corn and soybeans; tile 11TLM presents a more heterogeneous scenario, with winter wheat being the dominant crop class after pasture. The landscape is further characterized by shrubland, but crops such as spring wheat, alfalfa and potatoes are also cultivated. Grey and blue colors indicate urban and water areas, respectively.

Fig. 2 USDA land cover map of 2019 over the S2 footprint at 30 m for tiles 11TLM (left) of Washington and 15TVH (right) of Iowa. The legend details the main classes within the two tiles, among the 134 defined by USDA. Green lines and greyish areas define US county and state limits, respectively. The cloudiness map for 2019 was estimated as the per-pixel percentage of cloudy MODIS acquisitions

The analysis of MODIS time series was based on MODIS daily surface reflectance Collection 6 data (MOYD09GQ) distributed by the Land Processes Distributed Active Archive Center (LPDAAC; Footnote 3), which are gridded in the sinusoidal projection at 250 m resolution. Additionally, the product MOYD09GA was used to extract the observation-illumination geometry of each image. Since the nominal 250 m MODIS resolution decreases for off-nadir observations and due to inaccurate registration [79], the 250 m surface reflectance was re-scaled to 1 km spatial resolution by aggregating 4 × 4 pixels to mitigate that effect. The wide swath MODIS sensor allows for near global coverage of the Earth every day. However, it has a 16-day repeat cycle, which means that every day the geometry of observation is different and can include View Zenith Angles (v) of up to 65 degrees. As a consequence, the surface reflectance, which is defined for a given geometry of observation-illumination, has different values every day. In order to normalize the BRDF effects on the surface reflectance, we used the VJB method [80, 81]. This method uses longer compositing periods (five years in [80]) than the MCD43 product (16 days) [82], which reduces the noise in the normalized reflectance time series [83]. In this study, the nadir BRDF parameters at 1 km spatial resolution were derived using the most recent five years (2012–2016).
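The 4 × 4 aggregation from 250 m to 1 km can be sketched as a block average. The text does not state which aggregation operator was used, so the plain mean below is an assumption, as are the function name `aggregate_blocks` and the requirement that the grid dimensions be multiples of the block size.

```python
def aggregate_blocks(grid, factor=4):
    """Average non-overlapping factor x factor blocks of a 2-D grid.

    grid : list of equally long rows of reflectance values
    Returns a grid that is `factor` times smaller in each dimension.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the block size")

    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            coarse_row.append(sum(block) / len(block))  # plain mean (assumption)
        coarse.append(coarse_row)
    return coarse


if __name__ == "__main__":
    # Hypothetical 4 x 8 patch of 250 m pixels -> 1 x 2 patch of 1 km pixels
    patch = [[1.0] * 4 + [3.0] * 4 for _ in range(4)]
    print(aggregate_blocks(patch))
```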

By using the daily surface reflectance (from both Aqua and Terra) and its angular conditions during the five-year period considered, the variables that define the BRDF shape (V and R in Equation (1)) are derived using the approach proposed by [81]. The MODIS dataset consists of daily acquisitions covering the whole US territory during 2018 and 2019. For S2, a total of 237 and 158 partially cloudy or cloud-free images were collected for 11TLM and 15TVH, respectively. Although the nominal revisit time of S2 is five days, 11TLM is fully covered by orbit 113 and partially covered by orbits 70 and 13, whereas 15TVH is fully covered by orbit 69 and partially covered only by orbit 112. This explains the different number of S2 images. Details about the dataset are reported in Table 2.

Table 2 MODIS Terra/Aqua and S2A/B imagery information over the two test-sites

4.2 Time Series Processing Over Croplands

As pointed out in Sect. 2, the main disadvantage of coarse to moderate resolution sensors is the spectral mixing from different land cover types and crops. In order to characterize the degree of homogeneity of coarse resolution pixels, the higher spatial resolution information provided by CDL map and S2 imagery can be exploited. First, a common coordinates’ reference must be defined to allow establishing a pixel-to-pixel correspondence among the different information sources. MODIS data were cropped over S2 tiles 11TLM and 15TVH, projected to their corresponding UTM reference at 20 m using the nearest-neighbor interpolation, and finally aggregated at 1 km. Similarly, CDL maps were projected onto S2 UTM reference at 20 m.

The interpolated CDL was then used to calculate the percentage of each land cover class within each MODIS 1 km pixel. A qualitative description of the homogeneity of MODIS pixels is given in the 1 km land cover maps shown in images (a) and (b) of Fig. 3. For their generation, the three most likely classes at pixel level were taken into account. Denoting them, ordered by probability, as Cl1, Cl2 and Cl3, the RGB composite was obtained by weighting the color coding of the three classes with the corresponding percentages. The visual comparison of the land cover maps at 20 m (Fig. 2) and 1 km indicates that a dominant class can still be identified at MODIS scale. Yet, the less saturated colors point to the presence of non-negligible class mixing. The higher the mixing, the larger the difference between MODIS and S2 spectra. A quantitative estimation of this mixing effect is provided by the Gini-Simpson Index (GSI) [84, 85]. The GSI essentially quantifies how many different types of classes the pixel of interest contains, and is computed as the complement of the sum of the squared class fractions:

$$ GSI\left(x,y\right)=1-\sum \limits_{i=1}^{N_c}{p}_i{\left(x,y\right)}^2 $$
(1)

where (x, y) denotes the coordinates of the MODIS pixel's center, Nc is the total number of CDL classes within the pixel and pi is the fraction of the area covered by the ith class. The results obtained for the areas corresponding to the two S2 tiles are shown in Fig. 3c, d. The closer the GSI is to zero, the purer the pixel, i.e., a dominant class characterizes the pixel. Conversely, a higher GSI denotes a heterogeneous 1 km pixel where multiple classes are present with comparable percentages. Over the latter, the interpretation of coarse resolution imagery deserves special attention, as the information it contains cannot be transferred directly to individual classes. To clarify this concept, we use the Difference Vegetation Index (DVI).
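
As an illustration of the index, the following sketch computes the GSI per coarse pixel from a fine-resolution class map, using the Gini-Simpson definition (one minus the sum of the squared area fractions). The function name `gini_simpson`, the block size and the toy class map are assumptions made for this example only.

```python
import numpy as np

def gini_simpson(class_map, factor=50):
    """GSI of each coarse pixel, from an integer class map at fine resolution."""
    class_map = np.asarray(class_map)
    rows, cols = class_map.shape
    rows -= rows % factor
    cols -= cols % factor
    cropped = class_map[:rows, :cols]
    blocks = cropped.reshape(rows // factor, factor, cols // factor, factor)
    sum_sq = np.zeros((rows // factor, cols // factor))
    for label in np.unique(cropped):
        # Area fraction p_i of this class inside every coarse pixel.
        p = (blocks == label).mean(axis=(1, 3))
        sum_sq += p ** 2
    return 1.0 - sum_sq

# Toy map: the left 2 x 2 block is pure, the right block mixes two classes.
toy_classes = np.array([[1, 1, 1, 2],
                        [1, 1, 2, 2]])
gsi = gini_simpson(toy_classes, factor=2)
```

In the toy map the pure block yields a GSI of 0, while the mixed block (fractions 0.25 and 0.75) yields 1 − (0.0625 + 0.5625) = 0.375.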

Fig. 3

Synthetic land cover map at 1 km of tiles 11TLM (a) and 15TVH (b), based on the USDA land cover product at 30 m and weighted by the probability of classes Cl1, Cl2 and Cl3 within MODIS pixels. The corresponding GSI maps are shown in (c) and (d)

DVI is a non-normalized parameter simply defined as the difference between the near-infrared and the red bands. Its main advantage is that it describes the evolution in time of crop phenology while avoiding the saturation effects often detected with normalized indices such as NDVI [86]. Accordingly, five DVI time series at 1 km were generated from the two imagery sources. The first two are the MODIS DVI from the BRDF-corrected MODIS data, and the S2 DVI obtained by simply upscaling the S2 product to 1 km. In addition, for each 1 km pixel the three classes with the highest probability were selected, and the S2 DVI values of the 20 m pixels belonging to each of them were averaged separately. The corresponding DVI at 1 km for pixel (x, y) at time t was hence obtained as follows:

$$ DV{I}_{CLk}\left(x,y,t\right)=\frac{1}{N_{pClk}}\sum \limits_{i=1}^{N_{pClk}} DV{I}_{S{2}_{20m}}\left({x}_i,{y}_i,t\right) $$
(2)

where NpClk indicates the number of 20 m pixels (xi, yi) within the 1 km pixel centered in (x, y) and belonging to the class Clk, with k = 1, 2, 3. Examples of the five time series obtained for almost pure and heterogeneous pixels of corn and winter wheat are shown in Figs. 4 and 5, respectively.
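
A minimal sketch of the class-wise averaging for a single acquisition date is given below. It assumes co-registered NIR, red and class layers at 20 m; the function name `classwise_dvi` and the toy arrays are hypothetical, and the explicit loops are written for readability rather than speed.

```python
import numpy as np

def classwise_dvi(nir, red, class_map, factor=50, top_k=3):
    """Mean DVI of the top_k most abundant classes inside each coarse pixel.

    Returns an array of shape (top_k, n_rows, n_cols); entries stay NaN when
    a coarse pixel contains fewer than top_k classes.
    """
    dvi = np.asarray(nir, dtype=float) - np.asarray(red, dtype=float)
    class_map = np.asarray(class_map)
    n_rows, n_cols = dvi.shape[0] // factor, dvi.shape[1] // factor
    out = np.full((top_k, n_rows, n_cols), np.nan)
    for r in range(n_rows):
        for c in range(n_cols):
            win = (slice(r * factor, (r + 1) * factor),
                   slice(c * factor, (c + 1) * factor))
            labels, counts = np.unique(class_map[win], return_counts=True)
            # Cl1, Cl2, Cl3: classes ordered by decreasing abundance.
            order = np.argsort(-counts, kind="stable")[:top_k]
            for k, idx in enumerate(order):
                out[k, r, c] = dvi[win][class_map[win] == labels[idx]].mean()
    return out

# Toy 2 x 2 block: three pixels of class 1 and one pixel of class 2.
nir = np.array([[0.5, 0.5], [0.5, 0.3]])
red = np.array([[0.1, 0.1], [0.1, 0.2]])
classes = np.array([[1, 1], [1, 2]])
dvi_cl = classwise_dvi(nir, red, classes, factor=2)
```

Repeating the call for every acquisition date would give the three pure-class time series of a pixel.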

Fig. 4

DVI time series at 1 km over an almost pure corn pixel (a) and over pixels where corn is the most abundant but not the dominant class (b, c), from MODIS, S2 and S2-based pure classes (mean value ± 1 standard deviation within the 1 km pixel)

Fig. 5

DVI time series at 1 km over an almost pure winter wheat pixel (a) and over a pixel where winter wheat is the most abundant but not the dominant class (b), from MODIS, S2 and S2-based pure classes (mean value ± 1 standard deviation within the 1 km pixel)

In general, an analogous temporal evolution of DVI from MODIS (blue triangle) and upscaled S2 (magenta circle) images can be observed for the two crop types on both homogeneous and heterogeneous pixels, confirming both the effectiveness of the BRDF correction and the accuracy of the spatial alignment of the datasets. In terms of time sampling, the lower sampling rate of S2 does not seem to significantly affect the reconstruction of the overall shape of the vegetation dynamics. Yet, quantitative assessments of phenology descriptors are required to estimate the real effect on vegetation characterization. As expected, over pixels characterized by a GSI close to zero the coarse resolution imagery faithfully mimics the evolution of the dominant class (blue asterisks). There, the 1 km information can be used directly to infer crop properties, since spurious contributions from the remaining classes are negligible. On the contrary, pixels characterized by higher degrees of heterogeneity cannot provide a direct description of the crop type of interest: the less dominant the crop is within the pixel, the further the 1 km signal drifts from the pure-class time series. For such pixels, unmixing approaches are mandatory if a reliable vegetation evolution is to be retrieved, and if only coarse resolution imagery is available the solution comes at the cost of an additional loss of spatial detail. A successful solution has been put forward in [87], where spectral unmixing is carried out with an Ordinary Least Squares method and provides a unique time series per crop type at US county level. This implicitly assumes that a crop type behaves uniformly within a county. Overall, this hypothesis is fully satisfied for irrigation crops such as corn, for which the county-level characterization is also representative of crop behavior at 1 km. This can be observed in the normalized 2D histograms of DVI time series at 1 km for the tile 15TVH, shown in Fig. 6.
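
The idea behind least-squares unmixing can be sketched under a linear mixing assumption: the DVI of each coarse pixel is modeled as the area-weighted sum of one time series per class, shared by all pixels of a region. The code below is a generic illustration of that principle and not the implementation of [87]; the function name `unmix_ols`, the fractions and the class series are invented for the example.

```python
import numpy as np

def unmix_ols(fractions, coarse_series):
    """Estimate one time series per class by ordinary least squares.

    fractions     : (n_pixels, n_classes) area fractions of each class
    coarse_series : (n_pixels, n_times) coarse-resolution DVI
    returns       : (n_classes, n_times) class time series for the region
    """
    fractions = np.asarray(fractions, dtype=float)
    coarse_series = np.asarray(coarse_series, dtype=float)
    class_series, *_ = np.linalg.lstsq(fractions, coarse_series, rcond=None)
    return class_series

# Synthetic region: four coarse pixels, two classes, three dates.
fractions = np.array([[1.0, 0.0],
                      [0.5, 0.5],
                      [0.2, 0.8],
                      [0.7, 0.3]])
true_series = np.array([[0.1, 0.4, 0.2],   # class 1
                        [0.3, 0.3, 0.1]])  # class 2
coarse = fractions @ true_series           # noise-free linear mixing
estimated = unmix_ols(fractions, coarse)
```

With noise-free synthetic data the class series are recovered exactly; with real data the estimate is only as good as the assumption that each class behaves uniformly across the region.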

Fig. 6

Normalized 2D histograms of corn (tile: 15TVH) and winter wheat (tile: 11TLM) DVI time series at 1 km grouped by County

The corn region time series at 1 km was obtained by averaging at 1 km scale only the S2 pixels labeled as corn in the USDA land cover map. A minimum crop-type percentage threshold of 20% was also applied to filter out noisy information. The results over the four Iowa counties entirely covered by tile 15TVH (Hancock, Cerro Gordo, Franklin, and Wright) show that only minimal differences are detectable in the temporal evolution of the DVI, with a slightly higher dispersion of the time samples only at the start and end of the season. Because of the corn dominance, smooth temporal profiles with a clear phenology can be detected.

Conversely, when applying the same analysis to the winter wheat fields of tile 11TLM, a significant spreading of the time series during the whole evolution of the crop can be observed in the two counties of Washington (Franklin and Walla Walla). Whereas the bare soil periods before seeding and after harvest are stable over the whole tile, the magnitude of the phenological evolution of this crop type turns out to depend on the specific 1 km pixel selected for the analysis. The larger spread suggests a more heterogeneous land cover with variations in phenology due to different crop types and natural vegetation. These two contrasting land covers make the case for an in-depth, systematic and quantitative analysis, such as the one provided by the LSP calculation.

4.3 LSP Calculation Over Croplands

The two test cases presented in the section above (see, e.g., Fig. 6) have been processed by DATimeS in order to estimate the LSP metrics. To do so, gaps due to cloud cover were first filled by means of a machine learning (ML) fitting method over the temporal data. The ML algorithm Gaussian process regression (GPR) was chosen because of its excellent fitting performance (see [57, 58] for a quantitative analysis of over 20 gap-filling algorithms). In this way, cloud-free DVI maps were reconstructed on a five-day basis for the year 2019. Subsequently, the LSP metrics can be reliably calculated.
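
To make the gap-filling step more concrete, the sketch below implements a basic GPR prediction for a single pixel with NumPy. It is not the DATimeS implementation: the squared-exponential kernel, the fixed hyperparameter values (`length_scale`, `signal_std`, `noise_std`) and the sample DVI values are assumptions chosen for illustration, and in practice the hyperparameters would be optimized on the data.

```python
import numpy as np

def rbf_kernel(t1, t2, length_scale, signal_std):
    """Squared-exponential covariance between two sets of dates."""
    diff = t1[:, None] - t2[None, :]
    return signal_std ** 2 * np.exp(-0.5 * (diff / length_scale) ** 2)

def gpr_gap_fill(t_obs, y_obs, t_new,
                 length_scale=30.0, signal_std=0.2, noise_std=0.02):
    """Predict the mean and standard deviation of the series at dates t_new."""
    t_obs, y_obs, t_new = (np.asarray(a, dtype=float)
                           for a in (t_obs, y_obs, t_new))
    offset = y_obs.mean()  # constant mean removed before fitting
    k_obs = rbf_kernel(t_obs, t_obs, length_scale, signal_std)
    k_obs += noise_std ** 2 * np.eye(t_obs.size)
    k_new = rbf_kernel(t_new, t_obs, length_scale, signal_std)
    chol = np.linalg.cholesky(k_obs)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_obs - offset))
    mean = offset + k_new @ alpha
    v = np.linalg.solve(chol, k_new.T)
    # Variance of the latent (noise-free) function at the new dates.
    var = signal_std ** 2 - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

# Cloud-free samples (DOY) with a long gap between DOY 115 and DOY 180.
doy_obs = np.array([100, 105, 110, 115, 180, 185, 190])
dvi_obs = np.array([0.05, 0.06, 0.08, 0.11, 0.42, 0.43, 0.41])
doy_grid = np.arange(100, 191, 5)  # regular five-day grid
dvi_filled, dvi_std = gpr_gap_fill(doy_obs, dvi_obs, doy_grid)
```

The returned standard deviation is larger in the middle of the gap than close to the observed dates, which is the per-sample confidence information that makes GPR attractive for gap filling.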

The LSP metrics were calculated for the following three time series products:

  • S2Cl1: Sentinel-2 data at 1 km obtained by averaging only pixels labeled as the dominant crop at MODIS scale, according to the CDL map. Thus, this represents the time series of pure dominant crop within each MODIS pixel;

  • S2: Sentinel-2 data at 1 km resolution;

  • MODIS: MODIS data at the nominal 1 km resolution.

Starting with the homogeneous corn fields dataset, general LSP results are reported in Table 3. The mean values over all the pixels are provided, as well as the associated standard deviation (SD). Considering the pure corn crop S2Cl1 as reference, it can be noticed that the S2 and MODIS data at 1 km provide similar statistics, with the S2 product in particular providing values analogous to those of S2Cl1. The consistency can be explained by the dominance of corn fields in the S2 tile. The consistency of the LSP metrics among the three time series products can probably be better expressed by calculating the mean absolute deviation (MAD) and its dispersion, as displayed in Table 4. Differences are low, especially when comparing S2Cl1 against the S2 product, meaning that for this more homogeneous region both S2 and MODIS datasets provide consistent temporal information.
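
The comparison against the reference product can be sketched as follows, assuming that the MAD is the mean over pixels of the absolute difference between a metric from a given product and the same metric from S2Cl1, and that its dispersion is the standard deviation of those absolute differences. The function name and the SOS values are hypothetical.

```python
import numpy as np

def mad_against_reference(metric, reference):
    """Mean and standard deviation of |metric - reference| over valid pixels."""
    abs_dev = np.abs(np.asarray(metric, dtype=float)
                     - np.asarray(reference, dtype=float))
    return float(np.nanmean(abs_dev)), float(np.nanstd(abs_dev))

# Hypothetical SOS values (DOY) for three pixels.
sos_product = [166, 170, 160]
sos_reference = [165, 168, 164]
mad, mad_sd = mad_against_reference(sos_product, sos_reference)
```

Pixels where either product is masked can be passed as NaN and are then excluded from both statistics.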

Table 3 Mean and standard deviation (SD) of phenological indicators estimated from S2, S2Cl1 and MODIS over 1 km pixels whose Cl1 corresponds to corn (tile 15TVH). SOS, EOS and Day MV are in DOY 2019
Table 4 Mean absolute deviation (MAD) and standard deviation of the MAD (MSD) estimated from S2, S2Cl1 and MODIS over 1 km pixels whose Cl1 corresponds to corn (tile 15TVH). SOS, EOS and Day MV are in DOY 2019

When repeating the same exercise for the more heterogeneous landscape with winter wheat as the dominant crop (Table 5), it becomes apparent that the consistency between the S2 and MODIS information somewhat degrades. This especially holds for EOS and the consequent LOS, with more than a month of difference. On the other hand, the MV, Amp and day MV seem more robust, suggesting that the mismatch occurred only in identifying the EOS. In general, the S2 dataset more closely resembles the S2Cl1 dataset, as is also quantified by the Area between SOS and EOS. The differences between S2 and MODIS are also revealed by calculating the mean absolute deviation against S2Cl1 (Table 6); the differences with MODIS are up to twice as large as those with S2. Altogether, this suggests that the MODIS dataset is harder to interpret in view of the phenology of the dominant crop, winter wheat, due to the larger heterogeneity in croplands and patches of natural vegetation.

Table 5 Mean and standard deviation (SD) of phenological indicators estimated from S2, S2Cl1 and MODIS over 1 km pixels whose Cl1 corresponds to winter wheat (tile 11TLM). SOS, EOS and Day MV are in DOY 2019
Table 6 Mean absolute deviation (MAD) and standard deviation of the MAD (MSD) estimated from S2, S2Cl1 and MODIS over 1 km pixels whose Cl1 corresponds to winter wheat (tile 11TLM). SOS, EOS and Day MV are in DOY 2019

Figure 7 shows the maps for the more homogeneous region dominated by corn fields (tile 15TVH), and Fig. 8 shows the maps for the more heterogeneous landscape dominated by winter wheat (tile 11TLM). Masked areas correspond to water or urban pixels.

Fig. 7

LSP indicators for the year 2019 estimated from S2Cl1, S2 and MODIS at 1 km pixel scale over a more homogeneous agricultural region (tile 15TVH). SOS, EOS and Day MV are in DOY 2019. Masked areas correspond to water or urban pixels

Fig. 8

Phenological indicators for the year 2019 estimated from S2Cl1, S2 and MODIS at 1 km pixel scale over a more heterogeneous agricultural region (tile 11TLM). SOS, EOS and Day MV are in DOY 2019. Masked areas correspond to water or urban pixels

Starting with the corn field maps, the LSP metrics maps reveal that the region is highly homogeneous, both spatially and temporally. This is probably best visible in the SOS and EOS maps. All three maps show the same pattern, with a pronounced SOS around DOY 165–169 (mid-June) and EOS around DOY 278 (beginning of October). These numbers are in agreement with the typical corn growing patterns in the Corn Belt region [88]. The maximum DVI value (MV) and amplitude show somewhat more variation. Here slight discrepancies between S2 and MODIS can be noticed, with S2 closer to the reference maps of S2Cl1. The thin blue line in some of the S2 maps is due to border artifacts of those S2 acquisitions covering the tile only partially, which generate local discontinuities in time that ripple along the pixel time series and affect LSP estimation. In order to eliminate these effects, these partial acquisitions should be either filtered out from the collection or processed with morphological erosion operators to modify the boundary contours.

Conversely, the more heterogeneous landscape, with croplands of winter wheat but also grasslands and shrublands, displays more inconsistencies in the LSP metrics maps among the three data sources. While SOS still provides consistent patterns, with an SOS around DOY 85–91 (end of March), the EOS map is remarkably less consistent. Here, S2 still provides the same patterns as S2Cl1 (EOS in mid-September), while the MODIS data show a systematically later EOS (end of September). It is noteworthy that the S2Cl1 maximum value (MV) and amplitude (Amp) maps contain regions with markedly higher values than those of S2 and MODIS. Both S2 and MODIS deliver smoother, more blurred maps, which again must be attributed to the greater heterogeneity in vegetation cover.

Finally, in order to improve the understanding of the LSP maps, it is worth inspecting the temporal profiles of the three data sources more closely, and relating them to the land cover heterogeneity as expressed by the Gini-Simpson index (GSI). Figure 9 shows the temporal profiles of the three data sources for two pixels with contrasting GSI values: low for a corn field pixel and high for a winter wheat pixel. These temporal profiles also help to understand how the LSP indicators are calculated.

Fig. 9

Phenological indicators estimated from S2Cl1, S2 and MODIS for corn (low-GSI) [left] and for winter wheat (high GSI) [right]

Taking a closer look at the homogeneous corn fields (Fig. 9, left), the temporal profiles for S2Cl1, S2 and MODIS are shown at the top. A first observation is that the MODIS dataset is generally spikier, which is likely due to the higher temporal resolution, with more chances of observing inconsistencies, e.g., due to undetected cloud issues such as partial cloud cover. Regardless of the noise, the general temporal patterns of the three data products closely resemble each other. Accordingly, when the phenology indicators are calculated, they are alike. That is also shown in the individual calculation of SOS and EOS for each data source (see Fig. 9, bottom). For all three data sources the SOS and EOS were identified at about the same dates. Conversely, for the more heterogeneous winter wheat landscape with a high GSI (Fig. 9, right), the temporal profile of the wheat crop (S2Cl1) follows a distinct pattern compared to the MODIS and S2 patterns at 1 km. Winter wheat has an earlier peak than the other phenology patterns. A closer inspection of the individual SOS and EOS calculations reveals that both MODIS and S2 express a smoother and longer pattern due to the mixture of vegetation types (summer crops, grasslands or shrublands), with consequently similar identification of SOS and EOS. The result suggests that, independently of the source, a dataset at 1 km is less directly related to the phenology of a specific crop over heterogeneous areas, and thus more care is required when interpreting this information for crop monitoring.
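
How such indicators can be derived from a gap-filled profile is sketched below with a simple amplitude-threshold rule. This is an illustrative assumption and not necessarily the definition used by DATimeS: SOS and EOS are taken as the first and last dates of the contiguous run around the peak in which DVI stays at or above a given fraction of the seasonal amplitude. The function name, the threshold fraction and the toy profile are hypothetical.

```python
import numpy as np

def lsp_metrics(doy, dvi, fraction=0.5):
    """Single-season LSP indicators from a regular, gap-free time series."""
    doy = np.asarray(doy, dtype=float)
    dvi = np.asarray(dvi, dtype=float)
    i_peak = int(np.argmax(dvi))
    base = dvi.min()
    amplitude = dvi[i_peak] - base
    threshold = base + fraction * amplitude
    # Walk outwards from the peak while the profile stays above the threshold.
    i_sos = i_peak
    while i_sos > 0 and dvi[i_sos - 1] >= threshold:
        i_sos -= 1
    i_eos = i_peak
    while i_eos < dvi.size - 1 and dvi[i_eos + 1] >= threshold:
        i_eos += 1
    t, y = doy[i_sos:i_eos + 1], dvi[i_sos:i_eos + 1]
    # Trapezoidal integral of DVI between SOS and EOS.
    area = float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))
    return {"SOS": doy[i_sos], "EOS": doy[i_eos],
            "LOS": doy[i_eos] - doy[i_sos],
            "MV": dvi[i_peak], "DayMV": doy[i_peak],
            "Amp": amplitude, "Area": area}

# Toy seasonal profile sampled every 20 days.
doy = [100, 120, 140, 160, 180, 200, 220]
dvi = [0.0, 0.1, 0.4, 0.5, 0.3, 0.1, 0.0]
metrics = lsp_metrics(doy, dvi)
```

For the toy profile the threshold is 0.25, so the season runs from DOY 140 to DOY 180. A smoother and longer mixed-pixel profile would shift both dates outwards, which is consistent with the behavior described above for heterogeneous pixels.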

5 Discussion

Having outlined a general overview of EO missions and time series processing techniques applied to crop monitoring, this section provides a brief overview of a few ongoing trends in satellite-based crop monitoring. They are summarized under the following topics: (1) trends in EO missions; (2) trends in gap-filling methods; (3) trends in time series data fusion; and (4) trends in time series software.

When it comes to EO imagery for crop monitoring purposes, a trade-off has to be made between spatial and temporal resolution. It does not come as a surprise that spatial resolution is a key factor to consider in phenology detection, given that medium to coarse spatial resolution imagery from sensors such as MODIS or Sentinel-3 consists of pixels containing a heterogeneous mosaic of multiple land cover types with varying phenological signals [8]. The impact of heterogeneity has been demonstrated here for the winter wheat case within MODIS pixels. Hence, coarse resolution data limit the extraction of specific phenological stages for specific land cover types given this sub-pixel land cover heterogeneity [89, 90]. However, in the extensive review by [8], it was also argued that the spectral-temporal signal at the coarse spatial scale is more stable over longer periods of time because the land cover composition within pixels at a resolution of 1 km or coarser remains relatively static from year to year compared to higher spatial resolution pixels (e.g., S2) that detect common short-term land cover changes such as crop rotations. The study presented here analyzes only one growing season in a single year, and therefore that statement cannot be confirmed; yet the nominal S2 resolution (20 m) is well able to capture crop rotations (see also [57, 58]). At the same time, there is an ongoing tendency to move towards maximizing spatial and temporal resolution by making use of multiple satellites, i.e., constellations. This was first initiated with the two similar NASA satellites (Terra and Aqua), both of which are equipped with the MODIS sensor [91]. The same concept of launching multiple satellites was repeated with the S2 and Sentinel-3 constellations [92]. Progressing further along this trend, worth noting is the recent CubeSat initiative from Planet Labs, a private Earth imaging company (see Footnote 4). 
Over the last few years Planet Labs has designed and launched a constellation of more than 100 CubeSats. It forms the largest satellite constellation in the world and provides a complete image of Earth once per day at 3–5 m spatial resolution. The goal is to image the entirety of the planet daily to monitor changes and pinpoint trends. With such an unprecedented richness of spatiotemporal information, first initiatives are underway to estimate phenology stages at fine spatial resolution over the US Corn Belt and thus provide a significant advancement in crop monitoring and precision agriculture [93].

When it comes to EO imagery time series processing, there is a strong ongoing trend towards embracing artificial intelligence methods. In particular, machine learning (ML) fitting algorithms have emerged as attractive alternatives to conventional gap-filling functions. Not only may ML methods lead to more accurate reconstructions (see [57, 58] for a quantitative comparison), but they are also adaptive towards unevenly spaced data over multiple seasons. The GPR used in this chapter is of special interest, as its associated uncertainty estimate provides per-pixel information on the gap-filling confidence. Typically, the longer the gap between two consecutive input samples, the higher the uncertainty. Another interesting method is the Whittaker smoother, which is almost as accurate as GPR and much faster (results not shown). Its adaptive fitting performance has been reported earlier [56, 94]. It must also be remarked that the available gap-filling techniques differ, to a greater or lesser extent, in performance. Each method has its own advantages and drawbacks, which depend strongly on the characteristics of the input time series [8, 94], i.e., a method that fits some data well can be unsuited to a different set of data points. Concerning the appropriate length of the time series, there is no limit on the amount of data, and the accuracy of the time series reconstruction increases with the data size. The main limitation of the interpolation module is the long run time and high computational cost of specific algorithms. Although it is not the slowest method within the family of ML fitting methods (see [57]), GPR also becomes computationally inefficient in its standard per-pixel usage when processing time series of full images, mainly due to the GPR training step rather than the fitting step. 
To mitigate this computational burden, it was recently proposed to substitute the per-pixel optimization step with a cropland-based precalculation of the GPR hyperparameters θ [95], which basically govern the way training samples contribute to the time series reconstruction depending on their distribution along the time axis. The results of this optimized approach showed that accuracies were of the same order (at most 12% RMSE degradation), whereas processing was about 90 times faster. The alternative option of using the same hyperparameters for all the pixels within the complete scene was further evaluated. It led to similar overall accuracies over crop areas and similar computational performance. Hence, by calculating θ in advance and keeping it fixed, a substantial gain in run time can be achieved in time series reconstruction while maintaining the advantages of GPR, i.e., a high accuracy and the provision of associated uncertainties.
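
For comparison with GPR, the Whittaker smoother mentioned above can be written in a few lines in its standard penalized least-squares form: the smoothed series z minimizes Σ w_i (y_i − z_i)² + λ Σ (Δ^d z_i)², where missing samples receive zero weight. The sketch below assumes a regularly spaced time grid and uses a dense solver for clarity; the function name, the value of λ and the toy series are assumptions, and it is not the DATimeS implementation.

```python
import numpy as np

def whittaker_smooth(y, weights, lam=100.0, order=2):
    """Whittaker smoother on a regular grid; zero-weight samples are gaps."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = y.size
    # Difference matrix of the requested order, shape (n - order, n).
    d = np.diff(np.eye(n), n=order, axis=0)
    lhs = np.diag(w) + lam * d.T @ d
    # Gap values (possibly NaN) are replaced by 0 so they do not propagate.
    rhs = w * np.where(w > 0, y, 0.0)
    return np.linalg.solve(lhs, rhs)

# A linear trend with one missing sample is restored exactly, because a
# second-order penalty does not penalize straight lines.
series = np.array([1.0, 3.0, np.nan, 7.0, 9.0])
valid = np.array([1.0, 1.0, 0.0, 1.0, 1.0])
smoothed = whittaker_smooth(series, valid)
```

For full images a sparse banded solver would be the natural choice, since the system matrix has only a few non-zero diagonals.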

While in this chapter only single-source imagery time series datasets were addressed, some of the most exciting progress in time series analysis involves multi-source data fusion. Data fusion is being increasingly used to generate time series with high temporal and spatial resolutions [41, 96]. Data fusion algorithms are expected to generate fine resolution synthetic images based on infrequent observations at fine resolution and more frequent observations from coarse resolution remote sensing data [8]. ML methods are particularly promising for data fusion, and one of the most attractive fusion methods involves the multi-output (MO) version of GPR (MOGPR). This MOGPR approach was first introduced in [58] to fuse optical (S2) and radar (Sentinel-1) data for improved spatiotemporal reconstruction of vegetation products such as leaf area index (LAI). This approach proved to be particularly advantageous for time series with long gaps, such as those caused by prolonged cloud cover, where optical data alone notoriously fail. Here, the data from cloud-penetrating radar technology kick in as complementary information, although the relationship between radar and vegetation phenology is less obvious and strongly depends on structural properties. The main novelty of the solution proposed in [58] is that the parameters of the trained model implicitly predict the meaningfulness of any fusion approach: they quantify the amount of information shared between the two time series and rule the interaction of low- and high-frequency GPs for output reconstruction. Moreover, the LAI data gap filling described in [58] is only one example of the possible applications of MOGPR. In fact, MOGPR is not restricted to two data sources: multiple datasets can be fused. 
Any collection of time series can be entered into the MOGPR, i.e., the use of variables from multiple optical and radar data sources, coming from multiple satellite missions, e.g., Landsat, SPOT, the Sentinels, MODIS, can be envisaged, as long as they all share a certain amount of information and are georeferenced on a common grid. These data are nowadays easily accessible on cloud-based platforms such as the Google Earth Engine. Accordingly, in the present era of freely available, continuous multi-source satellite data streams, there is no doubt that fused time series processing will become indispensable in producing accurate cloud-free data and subsequent vegetation phenology monitoring.

Finally, to the benefit of the broader community and users in the agricultural sector, another interesting trend is that dedicated software packages are increasingly becoming available for image time series processing and phenology-related studies. As reviewed by [8], the best known and first software package is TIMESAT [68]. Subsequent software packages are variations and extensions of it, or are written in other (open-source) languages, such as: Phenological Parameters Estimation Tool (PPET [97]), enhanced TIMESAT [98], TimeStats [99], Phenosat [100], HANTS [101], CropPhenology [102] and QPhenoMetrics [103]. These software tools provide free functionalities for the reconstruction of time series data (e.g., vegetation indices) and the extraction of phenological information, customized with a number of user-defined input parameters. For data reconstruction they provide multiple common gap-filling methods such as logistic models, Savitzky–Golay filtering, asymmetric Gaussian functions, piecewise regression and Fourier transforms, and they generally perform well in LSP extraction (e.g., SOS, EOS) using common extraction methods, e.g., the threshold method and the inflection method [8]. It must hereby be remarked that all these software packages include the same established gap-filling algorithms. Apart from being equipped with these algorithms, the newly presented DATimeS software package [57] complements them with a suite of versatile ML fitting algorithms. In comparison to other time series software packages, DATimeS is state of the art through: (1) the ability to process unevenly spaced satellite image time series; (2) the possibility to select among multiple ML fitting methods for time series prediction (some methods include associated uncertainties, e.g., GPR); (3) the option to fuse multiple data sources with MOGPR; and (4) the provision and analysis of phenological indicators over multiple growing seasons (see Footnote 5).

6 Conclusions

Satellite imagery has become an essential source of information for monitoring agricultural lands. Specifically, optical data from EO image time series at high temporal resolution can assist in seasonal crop monitoring, as they provide key information about vegetation growing stages over large areas. In this chapter, the ongoing trends in image time series processing for the extraction of land surface phenology (LSP) metrics, which quantify the key moments of the crop growing season, are discussed. The identified trends are: (1) a tendency towards constellations of multiple satellites to reach both a high spatial and a high temporal resolution; (2) the adoption of machine learning algorithms for fitting multi-year and irregular time series data sources; (3) time series fusion of multiple data sources; and (4) the development of dedicated software packages. With the unprecedented availability of EO data and advanced image processing methods, these trends eventually lead to improved quantification of LSP metrics, e.g., start and end of season, but also of metrics more related to crop biomass or yield, such as amplitude and area. By making use of the newly developed DATimeS toolbox, the LSP calculation for time series of MODIS and S2 data at 1 km resolution over predominantly (1) homogeneous and (2) heterogeneous agricultural landscapes has been carried out. It is concluded that LSP metrics can be consistently calculated and related to the dominant crop type over a homogeneous landscape. Conversely, heterogeneous regions show some discrepancies in the LSP metrics, which may be a consequence of the more complex landscape with varying phenological behaviors of croplands and natural vegetation, combined with the different temporal resolution of the two sensors analyzed and the role of cloud cover herein. 
Altogether, given the extraordinary flexibility of current processing algorithms and toolboxes, it can be safely concluded that the same level of maturity is reached in exploiting optical EO data in the temporal domain as in the spatial and spectral domains.