Introduction

The global demand for food and other agricultural products has expanded with the world’s rapid population growth. Nevertheless, climate change and restricted land and water resources are anticipated to pose a substantial threat to food security. These challenges are therefore likely to constrain the attainment of the agricultural Sustainable Development Goals (SDGs) (United Nations, 2015), notably SDG-2. This goal includes the provision of nutritious food, which is closely linked to improved health and a more productive life, as highlighted by Martos et al. (2021). To maintain food security, several variables, including crop growth and yield, require proper monitoring. Moreover, accurate and trustworthy crop type mapping can assist in determining the proportions of agricultural areas and the geographical distribution patterns of crop varieties (Dahhani et al., 2022; Song et al., 2021). This necessitates the measurement of numerous spatio-temporal variables, either directly or indirectly.

Traditional methods for identifying and monitoring agricultural areas (crop types) rely on field investigation and statistical approaches, which demand significant human, financial, and material resources and are subject to human error. Currently, remote sensing (RS) technology offers cost-effective, irreplaceable, and considerably faster tools for conducting comprehensive and accurate land cover and crop type (LCCT) studies. Indeed, this promising technology has been progressively exploited in various studies (Asgarian et al., 2016; Belgiu & Drăguţ, 2016; Heupel et al., 2018; Luo et al., 2023; Song et al., 2021; Talukdar et al., 2020) owing to ongoing platform upgrades (e.g., Copernicus, 2022; USGS Team, 2022), free data availability, and, in some cases, access to high-quality data granted for education and research programs (Planet Team, 2022). According to He et al. (2022), RS images contain a wealth of textural information reflecting the spatial distribution structure of ground elements, which aids crop identification and improves classification accuracy. Therefore, several studies (Chakhar et al., 2021; He et al., 2022; Heupel et al., 2018; Htitiou et al., 2021; Orynbaikyzy et al., 2019; Song et al., 2021) have been carried out in recent years to investigate the use of RS data products (e.g., synthetic aperture radar (SAR) and optical data) with varying spatiotemporal resolutions (from low to very high) to provide trustworthy information in support of agricultural sustainability.

The supervised classification approach is one of the most often used techniques in LCCT identification, and it performs exceptionally well in RS image classification. Specifically, supervised machine learning (ML) classification algorithms, such as Support Vector Machine (SVM) (Cortes & Vapnik, 1995), Decision Tree (DT), and Random Forest (RF) (Breiman, 2001), have been successfully applied to LCCT classification and have exhibited high performance (as illustrated in Table 1). These algorithms, which are able to handle non-linear data, can identify complicated discriminating patterns when fed high-dimensional data, including original optical/SAR bands, numerous vegetation indexes, and phenological metrics (Löw et al., 2013). For crop type mapping in Spain, Chakhar et al. (2021) examined 22 algorithms, including DT, k-nearest neighbor (KNN), and SVM, and observed that the KNN classifier produced the greatest accuracy. In Japan’s agricultural area, Sonobe et al. (2017) applied four algorithms (SVM, RF, multilayer feedforward neural networks (mFNN), and kernel-based extreme learning machine (KELM)); their results indicated that the KELM algorithm performed best. Wang et al. (2019) examined three algorithms (RF, K-means (KM), and Gaussian Mixture Model (GMM)) and reported that the RF algorithm achieved the best performance for crop mapping in the United States Midwest. Further, ML approaches can improve accuracy as input data dimensionality rises, while avoiding the need for human-designed classification rules. Nonetheless, selecting an appropriate ML algorithm remains challenging since each approach has restrictions that affect its outcomes. Similarly, the recent review by Zhang and Li (2022) highlighted that challenges remain in implementing advanced ML, such as selecting and combining the relevant information for accurate land cover mapping. According to Martos et al. (2021) and He et al. (2022), identifying essential features is critical for training ML algorithms toward improved accuracy. Moreover, Orynbaikyzy et al. (2020) demonstrated that feature selection may further improve ML algorithms’ accuracy, especially when the number of training samples is limited.

Table 1 An overview of LCCT mapping research

Numerous studies have shown that LCCT mapping techniques employing time-series data outperform single-date approaches (Azar et al., 2016; El Imanni et al., 2022; Van Tricht et al., 2018). Besides, a recent review by Orynbaikyzy et al. (2019) emphasized that, in several studies concentrating on a few crop types (e.g., oilseeds, sugar crops, and cereals), combining optical and radar data with different spatial and temporal resolutions substantially enhanced crop type discrimination. Indeed, several studies have combined SAR data (i.e., Sentinel-1 (S1)) with optical imagery, notably Sentinel-2 (S2) (Chakhar et al., 2021; El Imanni et al., 2022; Pott et al., 2021; Sonobe et al., 2017; Van Tricht et al., 2018), Landsat, and Moderate Resolution Imaging Spectroradiometer (MODIS) images (Blickensdörfer et al., 2022; Song et al., 2021). For instance, to detect crops in Belgium, Van Tricht et al. (2018) used multi-temporal S1 and S2; their findings revealed that SAR data could enhance classification accuracy by 4%–14% compared to S2 alone. Likewise, Blickensdörfer et al. (2022) reported a 6%–10% accuracy gain for crop type identification when combining optical (Landsat-8 (L8) and S2), SAR (S1), and environmental data (i.e., elevation, topographic wetness index, temperature, and precipitation). Moreover, Song et al. (2021) combined S1 images with MODIS, L8, and S2 images to produce an enhanced crop type map. Additionally, Pott et al. (2021) found a 3% improvement in accuracy for mapping in-season crops by integrating S1, S2, and Shuttle Radar Topography Mission (SRTM) data. Despite numerous RS technology developments, few studies have concentrated on LCCT mapping with very high spatial resolution, such as PlanetScope (PS, 3 m) and RapidEye (RE, 5 m). For instance, Rao et al. (2021) and Kpienbaareh et al. (2021) produced crop type maps for India and Malawi, respectively, using combined S1, S2, and PS data. Other investigations conducted over Uzbekistan-Kazakhstan (Löw et al., 2013), Turkey (Ustuner et al., 2014), Germany (Heupel et al., 2018), and Serbia (Crnojević et al., 2014) indicate RapidEye’s utility for crop type discrimination. Table 1 summarizes LCCT classification experiences across various locations, algorithms, sensors, and numbers of classes.

Some research in Morocco has reported using multi-temporal optical and/or SAR remote sensing imagery to map land cover and land use (LULC) (Acharki et al., 2021; Hadria et al., 2009; Höpfner & Scherer, 2011; Mohajane et al., 2018). Nevertheless, only a few researchers (Acharki, 2022; Acharki et al., 2020; El Imanni et al., 2022; Htitiou et al., 2019, 2021; Ouzemou et al., 2018) have tried to map irrigated crops over diverse time periods with detailed classes, thus proving the tremendous potential of RS products for accurate crop mapping. For instance, in the Tadla irrigated perimeter (central Morocco), Ouzemou et al. (2018) identified 6 crop types using multi-temporal L8 data (October 2013–June 2014) and three machine learning algorithms (Spectral Angle Mapper, Support Vector Machine (SVM), and RF). El Imanni et al. (2022) recently combined S2 and S1 data (September 2020–March 2021) and identified 7 crop types and 1 non-cropland class in the same area using the RF algorithm. Moreover, Htitiou et al. (2021) used multi-temporal L8 and S2 data, along with the RF algorithm, to identify 10 cropland classes and 2 non-cropland classes in the Triffa and Tadla irrigated perimeters. This approach yielded an overall accuracy higher than 90%, revealing Sentinel-2’s relevance in producing accurate agricultural crop-type maps in semi-arid regions. Acharki et al. (2020) conducted the first study combining multi-temporal S1 and S2 data for crop type mapping in Loukkos, employing SAR and optical time series at various classification levels. They discriminated between 13 crop types and 8 non-cropland areas and obtained an overall accuracy of >86%. Moreover, Acharki (2022) recently mapped 6 crop types (fruit trees) and 10 non-cropland classes for 2020–2021 employing multi-temporal L8, S2, and PS data, and reported an overall accuracy above 82.2%. To date, no research in Morocco has investigated the ability of high-resolution imagery, such as RapidEye, to map crop types (e.g., cereals and horticultural crops) at different classification levels. In this research, we employ Support Vector Machine with a radial basis function kernel (SVMRB) and Random Forest (RF) as supervised per-pixel classification algorithms to map land cover and crop types (LCCT). These methods were chosen for their speed, robustness, and proven effectiveness for land cover classification tasks. Specifically, we investigate (i) the possibility of combining multi-temporal Sentinel-1 and Sentinel-2 data and (ii) the ability of multi-temporal PlanetScope and RapidEye data to improve LCCT mapping in a Mediterranean irrigated area (Loukkos, northwestern Morocco). This mapping is carried out using a hierarchical nomenclature with three classification levels.

Materials and methods

Study area

The study area covers 645 km² and is situated in the Loukkos irrigated perimeter (2,572 km²), which is well known for its abundant water and soil resources in northwestern Morocco (Fig. 1). The area’s soil is largely categorized as sesquioxide soils (69.2%), followed by vertisols (10.7%), calcimagnesic soils (8.0%), slightly developed soils (6.6%), and browned soils (5.5%). The study area consists predominantly of flat plains with relatively moderate elevation (less than 100 m). Furthermore, it has a Mediterranean climate with an annual average temperature of 18.3 °C and annual rainfall of 617.9 mm (Acharki et al., 2020).

Fig. 1
figure 1

Loukkos perimeter’s study area, illustrating (a) the study area’s location in northwestern Morocco and (b) a RapidEye true-color composite (bands 3-2-1) acquired on March 28, 2018

Agriculture in the study region is a significant socioeconomic development sector; it includes crops and fruit trees such as wheat, fodder crops, potatoes, groundnuts, chickpeas, beans, rice, citrus fruits, red fruits, sugar beets, and sugar cane. Barley, olive trees, watermelon, maize, peas, rosacea, lentils, avocado trees, sunflowers, rapeseed, and other legumes and vegetables are also present in Loukkos. In 2018, crop production in this region was 1,534,359.8 t, comprising 20.3% sugar crops, 1.5% oilseeds, 12.0% cereals, 23.8% fodder, 33.6% market gardening, and 8.8% arboriculture (Mouhssine, 2018). Moreover, this perimeter also supports dairy production and thereby has a considerable hay-producing area. Among all crop types, this study focuses on greenhouse crops, groundnut, potato, watermelon, rice, wheat, maize, and fruit trees. Acharki et al. (2020) reported that the landscape linked with these crops varies in terms of vegetation structure (e.g., vegetation height and leaf angle) and cover geometry (e.g., row planting, density, trees), as illustrated in Fig. 2.

Fig. 2
figure 2

Photographs of crop types taken in the field (2017–2018). a) Maize. b) Potatoes. c) Beans. d) Wheat. e) Greenhouse crops. f) Fruit trees (citrus). g) Watermelon. h) Groundnuts. i) Rice

Remotely sensed data

The methodology used in this research is depicted in Fig. 3.

Fig. 3
figure 3

Flowchart of methodology

Dataset

To map LCCT, we exploited cloud-free multi-temporal data from one SAR sensor (Sentinel-1 Level-1 GRD) and three optical sensors (Sentinel-2A and 2B at Level 1C, RapidEye Ortho Tile, and PlanetScope Ortho Scene), covering our study area from September 1, 2017 to August 31, 2018. Medium-resolution Sentinel-1 and Sentinel-2 (10 m) imagery was acquired via the Copernicus Open Access Hub (Copernicus, 2022). Furthermore, high-resolution RapidEye (5 m) and PlanetScope (3.1 m) imagery was acquired via the Planet Explorer website (Planet Team, 2022). The digital elevation model (SRTM), with a spatial resolution of 30 m, required for Sentinel-1 pre-processing, was obtained from Earth Explorer (USGS Team, 2022). Overall, a total of 34 images were used: twelve Sentinel-1 images, seven Sentinel-2 images, five RapidEye images, and ten PlanetScope images. Figure 4 illustrates the image acquisition dates for the four sensors.

Fig. 4
figure 4

Data acquisition dates for 2017–2018

Pre-processing

Sentinel-1 imagery

Sentinel-1 images were obtained at C-band (λ ≈ 5.6 cm) as Level-1 Ground Range Detected (GRD) products in Interferometric Wide (IW) swath mode. Although both polarizations (VV and VH) were obtained, only the descending orbit was used in this investigation; in crop studies, Chakhar et al. (2021) stated that ascending and descending mode data behave remarkably similarly. Python scripts based on the Orfeo Toolbox software (OTB, 2022) were applied to these images, as outlined by Frison and Lardeux (2018). The calibration, which is one of the most important Sentinel-1 processing steps (Acharki et al., 2021; Frison & Lardeux, 2018; Lopes et al., 2020), consists of converting the digital numbers into physical values to obtain the backscatter coefficient (sigma zero (σ0) in decibels (dB), [Eq. 1]). Subsequently, the SRTM was employed to correct the geometric distortions caused by changes in the satellite’s location and attitude. Finally, a multi-temporal filter was applied in order to reduce speckle effects (Quegan & Yu, 2001). This filtering method is particularly suited to time series consisting of a large number of images, such as those acquired by Sentinel-1 (Frison & Lardeux, 2018). According to Baghdadi and Zribi (2017), this method can reduce speckle noise while preserving image spatial resolution. The processed images retained both polarizations (VH and VV), from which the VH/VV (cross-polarization) ratio was derived. Several studies (Blickensdörfer et al., 2022; Song et al., 2021) have explored crop-type classification using the VH/VV ratio and revealed that this ratio improves SAR image quality.

$$\sigma_{dB}^{0}=10\cdot \log_{10}\left(\sigma^{0}\right)$$
(1)
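To make these final steps concrete, the following is a minimal Python sketch, not the authors’ Orfeo Toolbox pipeline, that applies Eq. (1) to calibrated backscatter arrays and derives the cross-polarization ratio; the band arrays are hypothetical placeholders.

```python
# Minimal sketch (not the OTB pipeline used in this study): convert
# calibrated, terrain-corrected, speckle-filtered sigma-naught values
# to decibels (Eq. 1) and derive the VH/VV cross-polarization ratio.
import numpy as np

def to_db(sigma0_linear: np.ndarray) -> np.ndarray:
    """Convert linear sigma-naught to decibels, masking non-positive values."""
    sigma0 = np.where(sigma0_linear > 0, sigma0_linear, np.nan)
    return 10.0 * np.log10(sigma0)

# Hypothetical calibrated bands for one acquisition date.
vh_linear = np.random.gamma(shape=2.0, scale=0.02, size=(512, 512))
vv_linear = np.random.gamma(shape=2.0, scale=0.05, size=(512, 512))

vh_db = to_db(vh_linear)
vv_db = to_db(vv_linear)
ratio_db = vh_db - vv_db  # in dB space, the VH/VV ratio becomes a difference
```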

Sentinel-2 imagery

Sentinel-2A and 2B images were acquired at Level 1C (top-of-atmosphere, TOA), implying that they have been geometrically and radiometrically corrected, but not atmospherically corrected (Acharki et al., 2021; Lopes et al., 2020). The atmospheric correction consists of transforming digital numbers (DN) into top-of-atmosphere (TOA) radiance and subsequently converting TOA reflectance into surface reflectance values. This process was carried out to generate Level-2A (bottom-of-atmosphere, BOA) reflectance images using the dark object subtraction algorithm (DOS1) (Goslee, 2011). Moreover, ten spectral bands were exploited in this research, comprising four 10 m bands. Numerous researchers have shown that these ten bands are the best suited for land applications (Acharki, 2022; Lopes et al., 2020) and that Sentinel-2’s red-edge bands are beneficial for crop classification (El Imanni et al., 2022; Song et al., 2021). In contrast, Sentinel-2 bands 1, 9, and 10 were eliminated due to their sensitivity to aerosols and cirrus clouds, as well as their low spatial resolution (60 m) and irrelevance for crop type identification (ESA Team, 2015). Furthermore, to enable integration and consistency, the six bands collected at a spatial resolution of 20 m (5, 6, 7, 8a, 11, and 12) were resampled to 10 m using bilinear interpolation. According to Stam and Fung (2011), this interpolation approach was chosen over nearest neighbor since it produces smoother results and enhances overall image quality.
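As an illustration of the resampling step, the sketch below, assuming a Sentinel-2 20 m band stored locally as a GeoTIFF (the file name is hypothetical), upsamples it to 10 m with bilinear interpolation using rasterio:

```python
# Minimal sketch: bilinear upsampling of a 20 m Sentinel-2 band to 10 m,
# as done here for bands 5, 6, 7, 8a, 11, and 12.
import rasterio
from rasterio.enums import Resampling

with rasterio.open("S2_B05_20m.tif") as src:  # hypothetical input path
    scale = 2  # 20 m -> 10 m
    data = src.read(
        out_shape=(src.count, src.height * scale, src.width * scale),
        resampling=Resampling.bilinear,
    )
    # Rescale the affine transform so pixel size matches the new grid.
    transform = src.transform * src.transform.scale(
        src.width / data.shape[-1], src.height / data.shape[-2]
    )
```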

RapidEye imagery

RapidEye Level 3A images with five spectral bands were selected. These are orthorectified products with radiometric, sensor, and geometric corrections. At nadir, RapidEye data have an initial spatial resolution of 6.5 m, whereas the RapidEye Ortho product is delivered resampled to a spatial resolution of 5 m. Previous research (e.g., Ustuner et al., 2014) demonstrated that RapidEye’s red-edge band (sensitive to chlorophyll content and enabling enhanced class separation) and near-infrared band were specifically designed for monitoring and identifying crop varieties in agricultural regions at regional and/or global scales.

PlanetScope imagery

Dove Classic, the initial PlanetScope generation, provided four spectral bands (RGB and NIR), whereas Dove-R and SuperDove (newer generations since 2019) supply eight spectral bands. In this research, four spectral bands at Level 3B, capturing radiation between 455 and 860 nm, were used. These products undergo orthorectification along with radiometric, geometric, and atmospheric corrections to obtain surface reflectance values. This provides more consistency through time and space while reducing ambiguity in the spectral response (Planet Labs P.B.C., 2021). It is worth noting that this atmospheric correction is based on the 6S radiative transfer model with ancillary data from the Moderate Resolution Imaging Spectroradiometer (MODIS) (Planet Labs P.B.C., 2021).

Ultimately, all pre-processed images were clipped to the study area’s borders. Subsequently, these images were combined into multi-band stacks using a virtual raster (VRT) approach to produce the datasets listed in Table 2, where each band represents a different spectral or radar channel. This approach allowed us to merge data from Sentinel-1 and Sentinel-2 while keeping their original details intact. It is noteworthy that all preprocessing steps were implemented in the R programming language (version 4.2.1; CoreTeam, 2022) within the RStudio integrated development environment.
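The stacking step can be sketched as follows with GDAL’s Python bindings (the authors worked in R; file names here are hypothetical), where `separate=True` places each input file in its own band of the virtual raster rather than mosaicking:

```python
# Minimal sketch: stack pre-processed, clipped single-band rasters into
# one virtual multi-band raster (VRT) without duplicating pixel data.
from osgeo import gdal

band_files = [
    "S2_20171003_B02.tif",  # one spectral/radar channel per file
    "S2_20171003_B03.tif",
    "S1_20171005_VH_db.tif",
]
vrt = gdal.BuildVRT("stack_2017_2018.vrt", band_files, separate=True)
vrt = None  # close the dataset to flush the VRT to disk
```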

Table 2 Different used remote sensing data characteristics

Training samples collection

We generated spatially distributed training sample data using (i) geotagged photographs from field surveys conducted in 2017–2018, (ii) Google Earth (Digital Globe) imagery, and (iii) our field expertise. To achieve a robust and balanced dataset, we adopted a stratified random sampling design, wherein the map classes were subdivided into separate sub-areas or strata depending on particular criteria, such as land-cover type. Within each stratum, samples were then randomly selected. We chose this commonly used approach to enhance the precision and reliability of our classification results, as recommended by Stehman and Foody (2019). We also adopted a three-level hierarchical classification, as presented in Fig. 5. The first level (Level I) comprises five non-cropland (land cover) classes and one cropland class, which are commonly applied in the literature. In the second level (Level II), the cropland class was subdivided into nine crop types in order to assess the potential for their detailed discrimination. It should be pointed out that crop type classes were chosen based on their availability during the field surveys; some of these crops are displayed in Fig. 2. In the third level (Level III), we combined the land cover (non-cropland) classes with the crop type (cropland) classes. This resulted in six, ten, and fourteen classes for Level I, Level II, and Level III, respectively. Based on this design, 7,521 spatially distributed samples covered the area in 2017–2018, with more than 500 sample points collected per class (as shown in Table 3). These sample datasets were then randomly partitioned into two parts, 70% for classification model initialization and 30% for model validation and performance evaluation, as illustrated in the sketch following Table 3.

Fig. 5
figure 5

LCCT three-level hierarchical classification

Table 3 LCCT classes and number of samples used for classification
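The 70/30 stratified partition described above can be sketched as follows, assuming the samples are stored in a table with a class-label column (the file and column names are hypothetical); stratification preserves per-class proportions in both parts:

```python
# Minimal sketch: stratified 70/30 split of the training samples.
import pandas as pd
from sklearn.model_selection import train_test_split

samples = pd.read_csv("training_samples.csv")  # hypothetical 7,521-point file
features = samples.drop(columns=["class"])
labels = samples["class"]

X_train, X_test, y_train, y_test = train_test_split(
    features, labels,
    train_size=0.70,    # 70% for model initialization
    stratify=labels,    # keep class proportions in both parts
    random_state=42,
)
```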

Classification and accuracy assessment

The supervised pixel-based classification for LCCT was performed using two well-known machine learning algorithms: Support Vector Machine with a radial basis function kernel (SVMRB) and Random Forest (RF). To train these algorithms, 5-fold cross-validation was applied to mitigate overfitting. RF and SVMRB classification procedures were conducted in the R programming language (CoreTeam, 2022). In total, 30 thematic maps were produced (five sensor datasets × two algorithms × three classification levels).

The Random Forest (RF) algorithm is a well-established supervised statistical classification approach that has become standard in many fields, most notably for mapping land cover/land use and crop types (Acharki, 2022; Acharki et al., 2021; Blickensdörfer et al., 2022; Kpienbaareh et al., 2021; Ouzemou et al., 2018; Van Tricht et al., 2018). Technically, it is a non-parametric algorithm (Breiman, 2001) that combines an ensemble of decision trees with an aggregation technique (Breiman, 2001; Rodriguez-Galiano et al., 2012). RF generates exceptionally small generalization errors (Breiman, 2001), making it relatively insensitive to noise and overtraining. Besides, the RF algorithm is able to process high-dimensional RS data and can determine important variables (Rodriguez-Galiano et al., 2012; Van Tricht et al., 2018). Similarly, it is computationally robust and has exhibited high accuracy compared with alternative algorithms (Acharki, 2022; Lopes et al., 2020; Zhang et al., 2020). Other advantages and limitations are discussed in Belgiu and Drăguţ (2016). Overall, the RF algorithm requires the configuration of two main parameters: Ntree (the number of trees to grow in the ensemble) and Mtry (the number of features used in each split) (Belgiu & Drăguţ, 2016). Previous studies (Belgiu & Drăguţ, 2016; Pelletier et al., 2016) identified Ntree and Mtry as the most crucial RF parameters, given their potential to substantially influence the classifier’s performance. In this research, we set Ntree to 25 and Mtry to 25, following Acharki (2022). These values, according to Lopes et al. (2020), provide a reasonable balance between classification accuracy and computation time.
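As an illustration, the following sketch configures a Random Forest with the paper’s Ntree = 25 and Mtry = 25 and evaluates it with the 5-fold cross-validation mentioned above. It uses scikit-learn (the authors worked in R), taking `n_estimators` and `max_features` as the analogues of Ntree and Mtry, and synthetic data as a placeholder for the multi-band samples:

```python
# Minimal sketch: RF with Ntree = 25 and Mtry = 25 under 5-fold CV.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic placeholder for the multi-band training samples.
X, y = make_classification(n_samples=1000, n_features=30, n_informative=10,
                           n_classes=6, n_clusters_per_class=1, random_state=0)

rf = RandomForestClassifier(
    n_estimators=25,   # Ntree
    max_features=25,   # Mtry (requires at least 25 input features)
    random_state=42,
)
scores = cross_val_score(rf, X, y, cv=5)  # 5-fold cross-validation
print(f"Mean CV accuracy: {scores.mean():.3f}")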

Support Vector Machine is a nonparametric statistical algorithm originally designed for binary classification. Several studies have demonstrated that SVMs can classify satellite images with reasonable accuracy (Acharki et al., 2021; Chakhar et al., 2021; Ghayour et al., 2021; Löw et al., 2013; Rao et al., 2021; Ustuner et al., 2014). Specifically, the SVM algorithm classifies linearly separable data by selecting the optimal N-dimensional hyperplane that best separates two classes (Cortes & Vapnik, 1995); detailed information on SVMs can be found in Cortes and Vapnik (1995). The SVM algorithm has five widely used kernel functions: radial basis function, linear, sigmoid, Laplacian, and polynomial kernels. In this research, the radial basis function (RBF) kernel was adopted for its computational efficiency. Several studies revealed that the RBF kernel, an exponential function, is the most prevalent choice for crop classification and delivers greater accuracy than other traditional techniques (Löw et al., 2013; Talukdar et al., 2020; Thanh Noi & Kappas, 2017). Conversely, Ghayour et al. (2021) concluded that the linear kernel provides the best classification accuracy compared to other SVM kernels. Besides, the SVM classifier’s performance depends on input parameters such as the cost (regularization, or penalty) parameter C and the kernel width parameter γ. According to Thanh Noi and Kappas (2017), high C values impose large misclassification penalties, which might lead to overfitting; high γ values likewise tend to overfit the training data. Therefore, the kernel type and parameters were set according to the findings of Ghayour et al. (2021).
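Similarly, below is a hedged sketch of an RBF-kernel SVM with a small grid search over C and γ; the grid values are illustrative, not the paper’s settings, and the synthetic data stand in for the real samples:

```python
# Minimal sketch: RBF-kernel SVM with a small C/gamma grid search.
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=30, n_informative=10,
                           n_classes=6, n_clusters_per_class=1, random_state=0)

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {"svc__C": [1, 10, 100],              # penalty parameter C
              "svc__gamma": ["scale", 0.01, 0.1]}  # kernel width gamma
search = GridSearchCV(svm, param_grid, cv=5)       # same 5-fold CV
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Scaling the features before the RBF kernel matters because γ weights all dimensions equally; the pipeline keeps the scaler inside each cross-validation fold to avoid leakage.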

To guarantee reliability and validity, each classification map was evaluated using the confusion (error) matrix approach (Foody, 1992). Moreover, precision (user’s accuracy) [Eq. (2)] and recall (producer’s accuracy) [Eq. (3)] were computed to measure the accuracy of individual classes. Precision, also known as positive predictive value, relates to commission errors and is defined as the number of correctly classified items relative to the total number of items assigned to that class in the classification. Recall, on the other hand, relates to omission errors and indicates the number of correctly classified items relative to the total number of items of that class in the validation samples. In addition, the F1-score [Eq. (4), (Van Rijsbergen, 1979)], the harmonic mean of precision and recall, was employed to evaluate classifier appropriateness by class. Overall accuracy (OA) [Eq. (5)] and the kappa coefficient [Eq. (6), (Foody, 1992)] were used for the global accuracy assessment. The overall accuracy, obtained by dividing the total number of correctly labeled samples by the total number of tested samples, measures the classifier’s overall efficiency. The kappa coefficient indicates the degree of agreement between sample data and predicted values (Foody, 1992).

$$P=\frac{TP}{TP+FP}$$
(2)
$$R=\frac{TP}{TP+FN}$$
(3)
$$F1\text{-}score=2\times \frac{P\times R}{P+R}$$
(4)
$$OA=\frac{\sum_{i=1}^{r}x_{ii}}{N}$$
(5)
$$kappa=\frac{N\sum_{i=1}^{r}x_{ii}-\sum_{i=1}^{r}\left(x_{i+}\times x_{+i}\right)}{N^{2}-\sum_{i=1}^{r}\left(x_{i+}\times x_{+i}\right)}$$
(6)

where TP, FP, and FN denote the number of pixels correctly classified, incorrectly classified into class i, and incorrectly excluded from class i, respectively; P: precision; R: recall; \(x_{ii}\): diagonal elements of the error matrix (correctly classified pixels); N: total number of samples in the error matrix; \(x_{i+}\) and \(x_{+i}\): row and column totals for class i; r: number of classes.
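For illustration, all of these metrics can be computed from validation labels as in the following sketch (the labels are hypothetical); scikit-learn’s `classification_report` returns per-class precision, recall, and F1-score:

```python
# Minimal sketch: per-class and global metrics of Eqs. (2)-(6).
from sklearn.metrics import (classification_report, accuracy_score,
                             cohen_kappa_score, confusion_matrix)

y_true = ["wheat", "rice", "wheat", "maize", "rice", "wheat"]
y_pred = ["wheat", "rice", "maize", "maize", "rice", "wheat"]

print(confusion_matrix(y_true, y_pred))             # error matrix
print(classification_report(y_true, y_pred))        # P, R, F1 per class
print("OA:", accuracy_score(y_true, y_pred))        # Eq. (5)
print("kappa:", cohen_kappa_score(y_true, y_pred))  # Eq. (6)
```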

In this research, during post-classification processing, we employed filtering to reduce noise effects and thus enhance the uniformity of the LCCT classifications. Specifically, a majority filter, configured with a one-pixel radius and a square search window, was applied using the SAGA toolbox in QGIS (version 3.24.1).
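A minimal sketch of such a majority filter, assuming a 3 × 3 square window (one-pixel radius) over an array of integer class codes (the array is hypothetical), analogous to the SAGA/QGIS step:

```python
# Minimal sketch: 3x3 majority (mode) filter over a classified raster.
import numpy as np
from scipy import ndimage

def majority(window: np.ndarray) -> int:
    """Return the most frequent class code in the moving window."""
    values = window.astype(int)
    return int(np.bincount(values).argmax())

classified = np.random.randint(0, 10, size=(200, 200))  # hypothetical map
smoothed = ndimage.generic_filter(classified, majority, size=3)
```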

Results and discussion

Accuracy analysis

An accuracy evaluation was conducted to assess the effectiveness of LCCT identification. The statistical metrics (overall accuracy and kappa coefficient) for the five sensor models (Sentinel-1 (S1), Sentinel-2 (S2), Sentinel-1 and Sentinel-2 combined (S1S2), RapidEye (RE), and PlanetScope (PS)), the two machine learning algorithms (RF and SVMRB), and the three classification levels are summarized in Table 4. The results indicate that overall accuracy spans from 77.35% to 97.87%, with kappa values ranging from 0.73 to 0.96. These accuracies are comparable to, or even better than, those reported by Chakhar et al. (2021), Kpienbaareh et al. (2021), Rao et al. (2021), and Van Tricht et al. (2018), as well as those of supervised machine learning methods applied by previous studies in other areas (Azar et al., 2016; El Imanni et al., 2022; Htitiou et al., 2019; Song et al., 2021; Sonobe et al., 2017). Conversely, they are slightly lower than those reported in the same area by Acharki et al. (2020) considering Sentinel-1 and -2 time series (2017–2018). This variation could be attributed to differences in the quantity of images, training samples, and classes used. According to He et al. (2022) and Song et al. (2017), ML algorithms’ performance is affected by both the types of classification features and the number of classification indexes. Moreover, Htitiou et al. (2019) claimed that the number of training samples influences classification accuracy. Löw et al. (2013) stated that as the number of features rises, classification accuracy may decrease, a phenomenon known as the ‘Hughes effect’. Additionally, other research (He et al., 2022; Orynbaikyzy et al., 2019) has found that the distinguishing information among different land covers lies within a low-dimensional feature space, and concluded that including more images often provides minimal additional information while increasing computational complexity (Orynbaikyzy et al., 2019). However, comparing accuracy values directly with previous studies is challenging due to variations in sensors, classes, and classification algorithms, as mentioned by Blickensdörfer et al. (2022).

Table 4 Statistical metrics for each classification level, algorithm, and sensor

The results reveal that SVMRB achieved the highest overall classification accuracy (78.46%–96.49%), independent of classification level or sensor type. RF ranks second, lower than SVMRB by 0.49%–3.55%. These findings align with earlier research suggesting that SVM is recommended for supervised classification in most cases (Ghayour et al., 2021; Löw et al., 2013; Rao et al., 2021; Ustuner et al., 2014). These studies argued that SVM outperforms owing to its ability to address overfitting and its suitability for smaller datasets with high dimensionality. In contrast, Ouzemou et al. (2018) mapped six crop types in the Tadla irrigated perimeter and found that RF slightly exceeded SVM, by 4% in overall accuracy. In terms of classification level, the findings demonstrate that Level I accuracies exhibit a slight improvement over Level II and Level III; that is, as the number of classes decreases, accuracy tends to increase. According to Acharki et al. (2020), classification complexity increases with the number of classes since subclasses exhibit less consistent behavior and have a limited number of training samples.

It is worth noting that S2 yields greater accuracy than S1, with discrepancies ranging between 10.31% and 16.21%; Sentinel-1 has the lowest accuracy (77.35%) regardless of the classification algorithm. This finding is consistent with previous work (Acharki et al., 2020; Lopes et al., 2020; Pott et al., 2021), which found that classification based on S2 data consistently outperformed classification based on SAR data. Besides, except for Level I, integrating S1 and S2 does not notably enhance crop mapping accuracy over S2 alone (a gain <1.07%), in agreement with earlier findings (Acharki et al., 2020; El Imanni et al., 2022; Lopes et al., 2020). For instance, El Imanni et al. (2022) and Acharki et al. (2020) found S1S2 performance similar to that of S2 alone, with gains of 1.22% and <1% in the Tadla and Loukkos irrigated perimeters, respectively; they proposed that S2 data alone could perform better than S1 and S2 combined. In contrast, earlier research (Blickensdörfer et al., 2022; Chakhar et al., 2021; Orynbaikyzy et al., 2019, 2020; Sonobe et al., 2017; Van Tricht et al., 2018) found that this combination substantially enhanced crop type identification compared to single-sensor data. For example, Van Tricht et al. (2018) identified eight crop types in Belgium using RF with S1 and S2 data and emphasized that integrating SAR data boosted overall RF accuracy by 4%–14%.

Similarly, our findings indicate that PlanetScope accuracies surpass RapidEye’s for crop type discrimination, which can be attributed to the improvement in spatial resolution. When comparing PS to RE models, the accuracy improvement varies from 3.24% to 12.08% for RF and from 2.75% to 11.28% for SVMRB. Notwithstanding the inclusion of the red-edge band, which is sensitive to plant chlorophyll content, RapidEye ranked second to last, ahead of only Sentinel-1. This might be because the collected RapidEye images did not capture the crop types’ whole phenological development and had the lowest number of features compared to the other sensors. Crnojević et al. (2014) noted that RapidEye images obtained over a short time period, covering only a very small fraction of the overall period spanning all selected crop types’ phenological development, are of low quality and cannot be relied on. Nevertheless, previous research has shown that RapidEye contributed substantial improvements in crop type classification accuracy, in both multi-sensor (Crnojević et al., 2014; Heupel et al., 2018) and single-sensor (Löw et al., 2013; Ustuner et al., 2014) scenarios.

It is noteworthy that the accuracies obtained by Sentinel-2 and PlanetScope differ little. Specifically, PlanetScope achieved the greatest classification accuracy for Levels II and III (a gap of 0.36%–1.20% relative to S2), while Sentinel-2 was the best model for Level I (a difference of 0.53%–1.24% relative to PS). This is consistent with prior research (Kpienbaareh et al., 2021; Rao et al., 2021) demonstrating PlanetScope’s ability to classify crop types. For instance, Rao et al. (2021) used SVM to identify four Indian crop types and found that PlanetScope surpassed Sentinel-2 by 1%.

Class discrimination comparison F1-score

In this section, the F1-score findings for Level I and Level II are interpreted (Fig. 6). Overall, the F1-score comparison for each level, class, sensor, and algorithm shows that the classifications obtained good F1-score values (>70%). This indicates that, with few exceptions, all classes were correctly differentiated by the five sensors used. For example, wetlands and other classes were poorly identified by S1 at Level I, and potato and other crops by RE at Level II, indicating that Sentinel-1 data has the weakest discriminative potential. The resulting F1-score values are comparable to earlier research (Htitiou et al., 2019), in which the values obtained were 67%–98%. Nonetheless, they are below those achieved by Acharki et al. (2020), who attained mean F1-score values >86.5%.

Fig. 6
figure 6

F1-score computed for five sensors, two algorithms, per level, and each class

Considering the algorithm, the average F1-score values for Level I with RF (SVMRB) are 75.2% (78.8%), 94.4% (96.2%), 94.3% (96.0%), 86.9% (91.2%), and 92.0% (95.6%) for S1, S2, S1S2, RE, and PS, respectively. For Level II with RF (SVMRB), these values are 80.1% (81.0%), 93.9% (94.4%), 94.7% (95.1%), 80.8% (83.2%), and 94.6% (95.8%) for S1, S2, S1S2, RE, and PS, respectively. SVMRB clearly led to the highest F1-scores, even though its performance was comparable to the RF algorithm for some classes. Although the SVM classifier demands considerable time and computational resources when dealing with extensive training datasets or numerous features, it demonstrated good F1-score results for LCCT classification. Furthermore, the average F1-score decreases as the number of classes increases, with few exceptions.

In terms of sensors, we found that for both levels, Sentinel-1 had the lowest F1-scores (>55.9% for RF and >63.4% for SVMRB). This finding contradicts that of Orynbaikyzy et al. (2020), who reported that Sentinel-1’s classification precision exceeds Sentinel-2’s. Their results can be explained by their selection of classes with phenological similarities and their exclusive focus on crop type classification, without other land cover classes. However, RapidEye was relatively successful in distinguishing between non-cropland (Level I) and cropland classes (Level II) (F1-score >77.6% for Level I). Furthermore, Sentinel-2 and PlanetScope clearly discriminated all classes well. From Fig. 6, it can be observed that the F1-score values of both sensors are nearly identical (a difference <1%) for several classes. Sentinel-2’s F1-score values outscored PlanetScope’s for certain classes, including forest and wetland (Level I) and wheat and greenhouse crops (RF, Level II). For other classes, such as potatoes, watermelon, other crops, and fruit trees, PlanetScope’s F1-score values surpass Sentinel-2’s. Other studies that investigated Sentinel-2’s and/or PlanetScope’s potential for land cover and/or crop-type mapping reported similar results (Acharki, 2022; Rao et al., 2021). Acharki (2022) showed that greenhouse crops were well distinguished by Sentinel-2, while crops were better identified by PlanetScope. It can be concluded that PlanetScope has a great capacity to detect crop classes since its high spatial resolution matches that of small farms, decreasing the probability of mismatched pixels at field boundaries. Besides, the discrepancies between Sentinel-2’s F1-score values and those of the Sentinel-1 and Sentinel-2 combination are minor, except for the watermelon (RF and SVMRB), other crops (SVMRB), and wheat (RF) classes (differences of 2.3%–4.9%). For example, watermelon was detected more effectively when Sentinel-1 and Sentinel-2 were combined. These findings are compatible with the F1-score findings reported by Lopes et al. (2020) and Acharki et al. (2020).

Considering the individual classes, the water and cropland classes were easily detected among the Level I classes, regardless of the sensor or algorithm, owing to their distinctive spectral characteristics. Although the reflectance properties of urban settlements differ from those of vegetation, they were only moderately detected by all sensors. Moreover, the Level II findings indicated that the F1-scores for rice, wheat, maize, groundnut, fruit trees, and greenhouse crops all exceed 96%, especially for PlanetScope and Sentinel-2 using the SVMRB algorithm, suggesting that these sensors discriminate these classes well. Groundnut was accurately classified even using Sentinel-1 alone (93.86%–95.80%). Other crops, such as potatoes and watermelons, have lower F1-score values (<93%), implying that they are less distinguishable than the other classes. This might be because such crops exhibit similar spectral behavior. Previous research has shown that phenological patterns are similar across a wide variety of crop types (Orynbaikyzy et al., 2020; Zhang et al., 2020). For instance, Zhang et al. (2020) observed some confusion between potatoes and maize since they are both dryland crops with similar growth cycles. Conversely, Van Tricht et al. (2018) showed that, in Belgium, Sentinel-1 and Sentinel-2 clearly identified potatoes, maize, and sugar beets in late August, while winter cereals were better discriminated in late June. Similarly, Orynbaikyzy et al. (2020) investigated sixteen crop types in Northern Germany and revealed that potatoes and maize had the highest F1-score values (0.76 and 0.79, respectively). A lower F1-score might also be attributable to field size, which can be influenced by mixed pixels at parcel boundaries, as mentioned by Orynbaikyzy et al. (2020). It can be seen that F1-score values, independently of sensor or algorithm, improve when all cropland (or non-cropland) classes are aggregated into single classes.

LCCT classifications

The classification findings shown in this section are for Level II, which includes one non-cropland and nine cropland classes. Figure 7 illustrates the spatial distribution of the various crops in the study area in 2017–2018.

Fig. 7
figure 7

a) SVMRB classification results based on PlanetScope imagery at level II (2017–2018). b) Comparison of RF and SVMRB classification results for the five used sensors in the zoomed area

Maps derived from RapidEye and Sentinel-1 (Fig. 7-b) contain confusion between numerous classes (such as potato and groundnut) and did not correspond to our local knowledge. However, the results from PlanetScope were very comparable to the visual interpretation map, and most parcels were accurately identified.

Conclusions

Crop type maps are crucial for developing agricultural sustainability policies and are also useful in other disciplines, such as environmental assessments. In this research, we evaluated the possibility of combining SAR (Sentinel-1) and optical (Sentinel-2) multi-temporal data, as well as the ability of high-resolution (PlanetScope and RapidEye) multi-temporal data, to improve land cover and crop type (LCCT) mapping. For this purpose, LCCT maps with six, ten, and fourteen classes (Levels I, II, and III, respectively) were created using two machine learning algorithms (support vector machine with a radial basis function kernel and random forest). A Mediterranean irrigated region in Loukkos, northwestern Morocco, was chosen as the experimental site. The results demonstrated that combining Sentinel-1 and Sentinel-2 did not enhance LCCT classification accuracy compared to Sentinel-2 alone. Besides, PlanetScope’s performance is better than that of the other sensors (Sentinel-1, Sentinel-2, and RapidEye), especially for Levels II and III; PlanetScope data identified all classes with high accuracy (F1-score > 86%). Furthermore, this research demonstrated the capability of SVMRB in LCCT classification. Classification accuracy increased when cropland or non-cropland classes were grouped into one class. In light of these findings, the resulting crop-type map can be utilized for a variety of purposes, including yield estimation analysis. Lastly, future research could investigate the new generation of PlanetScope data (with 8 bands) and additional environmental indices to improve crop type classification accuracy.