Introduction

Understanding the area and extent of croplands is important for a number of societal and environmental reasons (Johnson 2013). Remote sensing datasets have been widely applied in identifying agricultural crops. Approaches to land cover classification based on moderate spatial resolution imagery can be divided into two groups (Biradar and Xiao 2011). The first group originates from traditional land classification methods using detailed imagery and is essentially based on the spatial clustering of vegetation indices (VIs). This space-oriented approach is the dominant paradigm, but it relies on the user’s experience to interpret and label a spatial cluster as land cover (Biggs et al. 2006). The second group is the time-oriented approach (time series classification), which is based on the temporal profile analysis of vegetation indices at individual pixels. The time-oriented method has been successfully applied in agricultural crops and crop intensity mapping in recent years (Sakamoto et al. 2005; Wardlow et al. 2007; Galford et al. 2008; Howard et al. 2012; Singh et al. 2012).

Time-oriented methods to identify land cover characteristics can be roughly classified into two types (Zhang et al. 2013). The first type is based on signals observed at different scales decomposed through Fourier transform or discrete wavelet transforms (DWTs). By decomposing data into different frequencies, noise can be excluded and different components reflecting patterns at different scales can be obtained. In one study, different vegetation types were examined and found to have distinct patterns at different scales (Qiu et al. 2013a). Therefore, land cover classification can be achieved based on these long-term trends or seasonal patterns (Sakamoto et al. 2006; Galford et al. 2008; Geerken 2009; Singh et al. 2012). The second method is based on land surface phenological stages, including the green-up, the stable high biomass level, and the senescence periods. Distinct vegetation phenological parameters have long been investigated and widely applied in ecological and environmental studies. The metrics applied for quantifying the characteristics of different land covers generally include Euclidean or Mahalanobis distance measures (Li and Fox 2012; Sun et al. 2012); statistical parameters such as mean, trough, and peak points (Biradar and Xiao 2011; Zhou et al. 2013); and amplitude and standard deviation derived from annual vegetation indices time profiles (Arvor et al. 2011; Bridhikitti and Overcamp 2012). In general, the first type of method utilizes frequency information and the second type takes advantage of the phenological information closely associated with different times/stages.

Despite its capability of exploring temporal characteristics for agricultural crop discrimination, the time-oriented approach is still in its early stages (Nuarsa et al. 2012). It requires intensive, detailed study of individual land cover types in order to identify their unique physical and spectral features over time (Xiao et al. 2002). The approach would be more likely to identify the unique features of each specific crop or land cover type if information on both frequency and time/stages were combined, i.e., if the first and second types of method were integrated. However, few studies have been conducted in this area. This paper aims to develop a new time-oriented methodology that uses information on both the time and frequency dimensions to classify specific agricultural crops through the continuous wavelet transform (CWT). The CWT can describe the temporal and frequency characteristics of each specific land use by decomposing a signal at a continuum of positions (time) and frequencies (scale) instead of the dyadic scales of the DWT. It is particularly suited to the elucidation of scale- and time-dependent relationships and has been successfully applied in hydrological and environmental research (Gaucherel 2002; Bashmachnikov et al. 2013). Recently, it has also been applied to obtain data on crop intensity (Qiu et al. 2014).

This paper proposes a new approach for crop identification with wavelet variance and Jeffries–Matusita (JM) distance (CIWJ for short). The rest of the paper is organized as follows. The next section introduces the study area and data sources. The “Methodology” section describes the procedure of the CIWJ method. The “Results and discussion” section presents the application and validation of the CIWJ algorithm. We expect that the wavelet variance of the time-frequency dimensions can better characterize vegetation phenology and thus be more efficient for agricultural crop discrimination and other related applications.

Materials

Study area

The Heihe River Basin, the second largest inland river basin in the arid region of northwest China, consists of three major geographic units: the southern Qilian Mountains, the middle Hexi Corridor, and the northern Alxa Highland. The middle Hexi Corridor is sandwiched between these mountains and highlands, like a corridor trending from northwest to southeast. It is located in northwest China between 96° 42′–102° 00′ E and 37° 41′–42° 42′ N (17,000 km²) (Fig. 1). The middle Hexi Corridor plays an important role not only in providing abundant food and vegetables but also in maintaining ecosystem functions in the arid region. It has an arid or semiarid temperate climate, with mean annual precipitation ranging from less than 100 mm in the northern high-plains area to around 300 mm in the southern mountainous areas. Monthly precipitation is extremely variable, with rainy summers from June to August and dry winters. The mean annual temperature varies within 2.1∼8 °C along the altitudinal gradient of 1,100∼3,600 m. Due to its flat topography, abundant sunlight, and convenient water resources from the Qilian Mountains, this region has been cultivated for over 2,000 years. The region’s main crops are maize, spring wheat, and cole.

Fig. 1 Location of study area, spatial distribution map of average EVI and elevation

Data source

In recent years, moderate-resolution imaging spectroradiometer (MODIS) images have been widely utilized because they are capable of maintaining both spatial and temporal density for crop mapping from regional to global scales (Biggs et al. 2006; Galford et al. 2008; Biradar and Xiao 2011; Singh et al. 2012; Sakamoto et al. 2013). This study used an 8-day time series of 250-m MODIS enhanced vegetation index (EVI) data for 2012. The MODIS land data products incorporate enhanced atmospheric correction, cloud detection, improved geo-referencing, and an enhanced ability to monitor vegetation (Pagano and Durham 1993; Justice et al. 1998). The map projection was converted to Universal Transverse Mercator using ENVI image processing software. From late August to early September 2012, a field survey was carried out to collect in situ data in the study area. These ground-truth data were important for crop identification.

Methodology

A new methodology for discriminating specific agricultural crops and other land use/cover classes based on CWT is proposed (Fig. 2). It includes the following procedures: data preparation, feature selection and similarity calculation, criteria quantification, and accuracy assessment, as described in the following sections. The entire procedure was executed using the Matlab software package (The MathWorks, Natick, MA, USA).

Fig. 2 Overview of methodology

Remotely sensed input

A daily continuous EVI time series was created through the following procedure. First, for each pixel in the study area, every observation (out of 46 over the calendar year) with cloud contamination was discarded. Second, an unevenly spaced EVI time series for each pixel was created using the date-of-observation flag included in the data product. Finally, the daily continuous EVI time series was produced by linear interpolation over the entire study year. To reduce the edge effects induced by the wavelet transform, the one-year daily series was duplicated to form a series spanning 3 full years.
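The gap-filling steps above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' Matlab code. The function names `daily_series` and `pad_three_years`, the 366-day year length (2012 is a leap year), and the choice to hold the first and last valid values constant outside the observed range are assumptions made for the example.

```python
def daily_series(days, values, n_days=366):
    """Linearly interpolate cloud-free (day-of-year, EVI) pairs to daily values.

    `days` must be strictly increasing; `values` holds the matching EVI values.
    """
    daily = []
    for d in range(1, n_days + 1):
        if d <= days[0]:
            daily.append(values[0])        # assumption: hold first valid value
        elif d >= days[-1]:
            daily.append(values[-1])       # assumption: hold last valid value
        else:
            # locate the two observations that bracket day d
            for k in range(len(days) - 1):
                if days[k] <= d <= days[k + 1]:
                    w = (d - days[k]) / (days[k + 1] - days[k])
                    daily.append(values[k] + w * (values[k + 1] - values[k]))
                    break
    return daily


def pad_three_years(daily):
    """Repeat the one-year series three times to push edge effects outward."""
    return daily * 3
```

After the wavelet transform, only the middle copy would normally be kept for analysis, since the outer copies absorb the boundary distortion.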

Feature selection and similarity calculation

Continuous wavelet transform

The CWT was adapted for investigating the non-stationary process, e.g., the vegetation indices. A continuous wavelet transform was performed on each pixel of the MODIS EVI time series datasets in 2012. The Mexican hat wavelet is a real symmetric function that detects peaks and valleys. It allows for a good description of temporal resolution (Gaucherel 2002) and also has a similar shape as the EVI time profile. The main characteristics of agricultural phenology have to be precisely localized during the growing period. Therefore, the simple and commonly used “Mexican hat” mother wavelet was finally chosen to derive time-related characteristics in this study. However, the Morlet wavelet offers improved detection and localization of scale over the Mexican hat (Mi et al. 2005; Biswas and Si 2011). Therefore, the Morlet wavelet was also selected to obtain scale-related features. The Morlet wavelet has been shown to provide robust and satisfying results for scale analysis in geographical research areas (He et al. 2007; Partal 2012).

The CWT process is briefly introduced below. A detailed description and application guide can be found in related references (Daubechies 1990; Torrence and Compo 1998). The CWT converts the EVI time series into sets of coefficients by scaling (scale parameter a) and shifting (time-localization parameter b) the mother wavelet function.

$$ {\psi}_{a, b}(t)={\left| a\right|}^{-\frac{1}{2}}\psi \left(\frac{t- b}{a}\right) $$
(1)

The continuous wavelet transform of a signal f(t) is given as

$$ {W}_{\psi}\left( a, b\right)={\left| a\right|}^{-\frac{1}{2}}{\int}_R f(t)\psi \left(\frac{t- b}{a}\right) dt $$
(2)
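A discretised version of the transform may help make the roles of a and b concrete. The sketch below is an assumption-laden illustration rather than the implementation used in the study: it uses only the Mexican hat wavelet (the Morlet case is omitted), takes a unit time step, applies the conventional \( |a|^{-1/2} \) normalisation, and the names `mexican_hat` and `cwt` are hypothetical.

```python
import math


def mexican_hat(x):
    """Mexican hat (Ricker) mother wavelet, up to a constant factor."""
    return (1.0 - x * x) * math.exp(-0.5 * x * x)


def cwt(signal, scales, wavelet=mexican_hat):
    """Direct-sum CWT: coeffs[i][b] = |a|^(-1/2) * sum_t f(t) * psi((t - b) / a).

    `scales` are in units of time steps; the result has one row per scale
    and one column per time position b.
    """
    n = len(signal)
    coeffs = []
    for a in scales:
        norm = 1.0 / math.sqrt(abs(a))
        # psi depends only on the lag t - b, so tabulate it once per scale
        kernel = {lag: wavelet(lag / a) for lag in range(-(n - 1), n)}
        row = [
            norm * sum(signal[t] * kernel[t - b] for t in range(n))
            for b in range(n)
        ]
        coeffs.append(row)
    return coeffs
```

The direct sum costs on the order of n² operations per scale, which is acceptable for a sketch; an FFT-based convolution would be the usual choice for full image stacks.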

Create TAWV and SAWV profile

The magnitude of the wavelet coefficients at a time or scale represents the local variance, and the sum of all local wavelet variances is equal to the total variance. Therefore, the sum or average of the wavelet variances can be used to examine the temporal/scale variation of the vegetation dynamic at different times/scales. In order to interpret the local wavelet spectrum, the time-averaged and scale-averaged wavelet variances were selected to examine the overall characteristics of the data. The time-averaged wavelet variance (TAWV) and scale-averaged wavelet variance (SAWV) were calculated to characterize the time- and scale-related variability based on the CWT of the Mexican hat mother wavelet and the Morlet wavelet, respectively. The TAWV could be applied to localize the main characteristics of the vegetation phenology, and the SAWV could be utilized to describe the dynamic frequency and amplitude of different land uses. The standard TAWV and SAWV profiles for each specific land use class were established based on training areas from the field survey sites in large fields with relatively pure land use/cover classes.

The time-averaged wavelet variance is defined as an average of the wavelet energy against the time.

$$ V(a)=\frac{1}{n-1}{\displaystyle \sum_{i=1}^n{\left( W\left( a, {b}_i\right)-\overline{w}(a)\right)}^2} $$
(3)

where \( \overline{w}(a) \) is the computed average of the wavelet coefficients over the n time positions \( b_i \) for a selected scale a.

The scale-averaged wavelet variance is defined as an average of wavelet variance against the scale.

$$ {W}_w(b)=\frac{1}{n-1}{\displaystyle \sum_{i=1}^n{\left( W\left( {a}_i, b\right)-\overline{w}(b)\right)}^2} $$
(4)

where \( \overline{w}(b) \) is the computed average of the wavelet coefficients over the n scales \( a_i \) for a selected time b.
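Read literally, Eqs. (3) and (4) are sample variances taken along one axis of the coefficient matrix. The following sketch shows that reading; the `coeffs[scale][time]` layout and the function names are assumptions made for the illustration.

```python
def variance(values):
    """Sample variance with the 1/(n-1) factor used in Eqs. (3) and (4)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def variance_per_scale(coeffs):
    """Eq. (3): V(a), the variance over all time positions at each scale."""
    return [variance(row) for row in coeffs]


def variance_per_time(coeffs):
    """Eq. (4): W_w(b), the variance over all scales at each time position."""
    return [variance(list(col)) for col in zip(*coeffs)]


# Toy coefficient matrix: 2 scales (rows) by 3 time positions (columns)
toy = [[1.0, 2.0, 3.0],
       [2.0, 4.0, 6.0]]
per_scale = variance_per_scale(toy)   # one value per row
per_time = variance_per_time(toy)     # one value per column
```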

Determine the proper time/scale range based on separability

A feature selection process was applied to derive the proper time or scale range of the time-averaged and scale-averaged wavelet variance profile based on the computation of separability test. The Jeffries–Matusita (JM) distance metric, which provides a flexible and intuitive separability index, was utilized to guide the feature selection process. The class separability among specific crops in the time-averaged and scale-averaged wavelet variance profiles was investigated using the JM distance statistic. The JM distance statistic proved to be an effective measure for this task in related studies (Van Niel et al. 2005; Wardlow et al. 2007; Arvor et al. 2011). The JM distance between a pair of class-specific probability functions is given by

$$ \mathrm{JM}\left({c}_j,{c}_k\right)={\displaystyle {\int}_x{\left(\sqrt{p\left( x\left|{c}_j\right.\right)}-\sqrt{p\left( x\left|{c}_k\right.\right)}\right)}^2 dx} $$
(5)

In this paper, x represents a span of wavelet variances based on the original EVI time series, and \( c_j \) and \( c_k \) indicate the two crop or land use classes under consideration. The JM distance ranges between 0 (low separability) and 2 (high separability) and is computed as in Eqs. (6)–(8).

$$ \mathrm{JM}=2\left(1-{e}^{- B}\right) $$
(6)
$$ \mathrm{where}\kern1em B=\frac{1}{8}{D}^2+\frac{1}{2}\ln \left(\left|\frac{{\Sigma}_j+{\Sigma}_k}{2}\right|/\sqrt{\left|{\Sigma}_j\right|\left|{\Sigma}_k\right|}\right) $$
(7)
$$ \mathrm{and}\kern1em {D}^2={\left({\mu}_j-{\mu}_k\right)}^T{\left(\frac{{\Sigma}_j+{\Sigma}_k}{2}\right)}^{-1}\left({\mu}_j-{\mu}_k\right) $$
(8)

\( \mu_j \) and \( \mu_k \) correspond to class-specific expected wavelet variance values, and \( \Sigma_j \) and \( \Sigma_k \) are unbiased estimates of the covariance matrices of classes j and k (Wardlow et al. 2007). The JM distance, which can range between 0 and 2, provides a general measure of separability between two classes according to their probability distributions (Wardlow et al. 2007). In order to obtain a global separability index, the weighted average separability distance \( J_{\mathrm{ave}} \) was calculated and utilized (Wardlow et al. 2007).

$$ {J}_{\mathrm{ave}}=\frac{1}{m\left( m-1\right)}{\displaystyle \sum_{j=1}^{m-1}{\displaystyle \sum_{k= j+1}^m{\left(\mathrm{JM}\right)}_{j k}}} $$
(9)

where m is the number of classes and \( \mathrm{JM}_{jk} \) corresponds to the Jeffries–Matusita distance between classes j and k.
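For intuition, the separability measure can be worked through in the single-feature case, where each covariance matrix reduces to a scalar variance. The sketch below assumes Gaussian class distributions and the usual Bhattacharyya form of B, with the squared term \( D^2 \) of Eq. (8) entering Eq. (7). The function names and the (mean, variance) tuples are hypothetical.

```python
import math


def jm_distance(mu_j, var_j, mu_k, var_k):
    """JM distance between two classes described by one Gaussian feature."""
    pooled = 0.5 * (var_j + var_k)
    d2 = (mu_j - mu_k) ** 2 / pooled                        # Eq. (8), scalar case
    b = d2 / 8.0 + 0.5 * math.log(pooled / math.sqrt(var_j * var_k))  # Eq. (7)
    return 2.0 * (1.0 - math.exp(-b))                       # Eq. (6)


def average_jm(classes):
    """Eq. (9) as printed: sum of pairwise JM distances divided by m(m-1).

    `classes` is a list of (mean, variance) tuples, one per class.
    """
    m = len(classes)
    total = 0.0
    for j in range(m - 1):
        for k in range(j + 1, m):
            total += jm_distance(*classes[j], *classes[k])
    return total / (m * (m - 1))
```

Two classes with identical statistics give a distance of 0, and the value approaches 2 as the class means move apart relative to their spread.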

The JM distance was calculated for different combinations of time (scale) ranges among the standard TAWV and SAWV profiles, respectively, and the time or scale ranges yielding the largest JM distance were selected. After the ranges with the best separability were obtained, curve shape matching was implemented by examining the similarity between the curve of each unclassified pixel and the standard TAWV and SAWV profiles. The JM distance in n-dimensional space (see Eq. (5)) between two vectors was applied for this purpose.

$$ {D}_{\mathrm{JM}}=\sqrt{{\displaystyle \sum_{i=1}^n{\left({x}_i-{A}_i\right)}^2}} $$
(10)

where \( D_{\mathrm{JM}} \) is the JM distance, \( x_i \) is the time-averaged wavelet variance value of a measured pixel in space i, \( A_i \) is the average of training samples in space i, and space i represents the time or scale step (8 days) (i = 1, 2, …, n).
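Curve matching with Eq. (10) amounts to a nearest-profile search. A minimal sketch follows, in which the dictionary of reference profiles, the class labels, and the function names are illustrative assumptions.

```python
import math


def profile_distance(pixel, reference):
    """Eq. (10): root of the summed squared differences between two profiles."""
    return math.sqrt(sum((x - a) ** 2 for x, a in zip(pixel, reference)))


def nearest_class(pixel, references):
    """Return (label, distance) for the reference profile closest to the pixel."""
    distances = {
        label: profile_distance(pixel, ref) for label, ref in references.items()
    }
    label = min(distances, key=distances.get)
    return label, distances[label]


# Hypothetical three-step profiles, for illustration only
references = {"maize": [0.0, 1.0, 0.0], "spring wheat": [1.0, 0.0, 0.0]}
label, dist = nearest_class([0.1, 0.9, 0.0], references)
```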

Similarity calculation

The curve shapes in the TAWV and SAWV profiles were utilized as the major metrics to identify maize, spring wheat, cole, non-agri vegetation, and non-vegetation. For each pixel, the TAWV and SAWV JM distances with these five land uses were calculated on the specific time or scale range, with the best class separability from the time-averaged and scale-averaged wavelet variance profile, respectively.

Criteria quantification

The procedure for criteria quantification is provided in Fig. 3. The TAWV was applied as the dominant metric for land use identification, because it is closely related and very sensitive to vegetation phenology. Therefore, the TAWV JM distance was first applied to assess the similarity between the five land uses and unclassified pixels. Results showed that the JM distance ranged from 0.083 to 2 within the study area. The pixel with the least JM distance to one specific land use was classified as this land use. However, if one pixel was assigned as one specific land use with a TAWV JM distance above a threshold (this paper uses a value of 1.5), it was assumed that it was classified with some uncertainty and needed further investigation. Therefore, the SAWV JM distance was also utilized to evaluate the similarity between the five land uses and unknown pixels. If the classified result was confirmed by the SAWV JM distance, it was accepted. When the classified result based on the SAWV JM distance differed from that from the TAWV JM distance, the result based on the SAWV JM distance was accepted on the condition that the SAWV JM distance between the unknown pixel and the specific land use was less than a threshold (this paper uses a value of 1.5). However, if the minimal TAWV and SAWV JM distances away from any specific land use were both larger than 1.5, and results based on these two JM distances conflicted with each other, the unknown pixel may not be any of these established land uses. In this situation, we could redefine the classification system or alter the threshold, and the unknown pixel could be assigned as another land use.

Fig. 3 The procedure of criteria quantification

Notes: \( D_{T-1}, D_{T-2}, \dots, D_{T-n} \) and \( D_{S-1}, D_{S-2}, \dots, D_{S-n} \) represent the TAWV and SAWV JM distances between unknown pixels and established land uses (N = 1, 2, …, n), respectively; θ is the threshold.
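The decision rule described in this section can be written as a short function. This is one possible reading of the text, not a transcription of the authors' code: the function name `classify_pixel` is hypothetical, distances are passed as dictionaries keyed by land use label, and a SAWV distance exactly equal to the threshold is treated as unresolved, a case the text does not specify.

```python
def classify_pixel(tawv_dist, sawv_dist, theta=1.5):
    """Assign a land use from TAWV and SAWV JM distances.

    Returns the chosen label, or None when the pixel matches no
    established land use closely enough.
    """
    label_t = min(tawv_dist, key=tawv_dist.get)
    if tawv_dist[label_t] <= theta:
        return label_t                 # TAWV match is close enough on its own
    label_s = min(sawv_dist, key=sawv_dist.get)
    if label_s == label_t:
        return label_t                 # SAWV confirms the uncertain TAWV result
    if sawv_dist[label_s] < theta:
        return label_s                 # SAWV disagrees but is itself confident
    return None                        # both uncertain and in conflict
```

A return value of `None` corresponds to the case in which the classification system or the threshold would need to be revisited.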

Accuracy assessment

In general, accuracy evaluation of moderate-resolution land cover products is a challenging task as a result of overestimates or underestimates connected with the image fragmentation and sub-pixel problem (Biradar and Xiao 2011). Extensive field surveys for site-specific data were generally limited to agriculturally intensive regions, considering budget constraints and human accessibility. We evaluated the MODIS-derived agricultural crops data through classification results obtained from images with relatively higher spatial resolution, the Chinese Environmental Disaster Reduction Satellite (HJ-1 satellite for short) images (Qiu et al. 2013b).

The cloud-free HJ-1 satellite images obtained from different dates (from early July to mid-September) throughout the growing season were collected to identify different land use/cover types (maize, spring wheat/cole, non-agri vegetation, and non-vegetation). A total of six HJ-1 satellite images were applied. They were geometrically corrected and transformed to Universal Transverse Mercator projection. HJ-1 images in summer were applied to discriminate the agricultural crops, non-agricultural crops, and non-vegetation. HJ-1 images in mid-September were exploited to obtain the distribution of maize and non-agri vegetation, which could be easily discriminated from the background because almost no other live crops appear at that time. The land use distribution map in 2012 was determined through visual interpretation. It was resampled to 250-m target resolution by the majority resampling method, in order to be comparable with classification results from MODIS data.
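Majority resampling can be illustrated on a small label grid. The sketch below assumes an integer block size and breaks ties in favour of the label encountered first; both are simplifications, since the ratio between the HJ-1 pixel size and 250 m is not an integer and the tie rule used in the study is not stated. The function name is hypothetical.

```python
from collections import Counter


def majority_resample(grid, factor):
    """Aggregate a 2-D grid of class labels by block-wise majority vote.

    Rows and columns that do not fill a complete block are dropped.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [
                grid[r + i][c + j] for i in range(factor) for j in range(factor)
            ]
            # most_common(1) returns the most frequent label in the block
            out_row.append(Counter(block).most_common(1)[0][0])
        out.append(out_row)
    return out
```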

Results and discussion

Temporal variability of the EVI of different land uses

The main difference between agricultural crops and other land uses lies in vegetation phenology. In the middle Hexi Corridor, agricultural crops were generally sown in late March, corresponding to the Waking of Insects (wheat and cole), or early April (seed corn). However, their harvest times are quite distinct. The three main agricultural crops (maize, spring wheat, and cole) are generally harvested in early or mid-October, mid-July, and mid-August, respectively. Compared with agricultural crops, non-agri vegetation such as forests, shrub, grass, and urban vegetation generally has a relatively longer growing season, extending from early April to late October or early December until the arrival of the first frost.

Apart from vegetation phenology, remarkable variations were also observed between agricultural crops and other land use/cover. When agricultural crops are planted in an agricultural field, their land cover varies from bare land to a mixture of bare land and green seedlings from the first month to almost full vegetation in the second, third, and fourth months (depending on different crops) and bare land again after harvest time. However, non-agri vegetation and non-vegetation areas generally have similar land cover over a relatively long period (i.e., for non-agri vegetation: the growing period; for non-vegetation: the whole year).

The EVI temporal profile for maize, spring wheat, cole, non-agri vegetation, and non-vegetation in 2012 is provided in Fig. 4. A distinct separability of vegetation phenology was observed among these five land uses. The EVI time series for non-vegetation is characterized by nearly constant low or relatively low EVI values throughout the year. Non-agri vegetation has a medium annual EVI level and moderate EVI increase during the growing season. In arid temperate regions, the agricultural crops’ EVI profiles generally show a single vegetation cycle corresponding to crop phenology with high EVI values during rainy and warm seasons (summer) and low values during dry and cold seasons (winter). Considering that these three agricultural crops have only one single growing season and similarly low values in the winter, their main characteristics are the amplitude and width of the EVI profiles within their growing period. The highest amplitude of intra-annual variability of agricultural crops was found in cole, followed by maize and spring wheat. The EVI profile of maize is relatively low and wide, corresponding to its considerably longer growing season, which is obviously different from cole and spring wheat. Non-vegetation areas demonstrated very little intra-annual variability.

Fig. 4 The EVI temporal profile of maize, spring wheat, cole, non-agri vegetation, and non-vegetation in 2012

Wavelet variance profile of different land uses

A continuous wavelet transform was first performed on the established EVI time profile of maize, spring wheat, cole, non-agri vegetation, and non-vegetation from the training areas in 2012. The time-averaged and scale-averaged wavelet variances were then calculated and averaged for the five land use/cover types (Fig. 5). Similar to the original EVI profile, significantly higher values were observed from the time-averaged wavelet variances of these three agricultural crops, which were distinctly different from non-agricultural land use/cover types. The annual time-averaged wavelet variances of these three agricultural crops were characterized with two troughs, one during spring and one during autumn, and one obvious peak during the summer season. Each agricultural crop also differed from other crops in amplitude and width. This is particularly apparent during the planting season from April to May, when a distinct difference in time-averaged wavelet variances was observed among these three agricultural crops. The highest amplitude of intra-annual variability of agricultural crops was found in maize, followed by cole and spring wheat.

Fig. 5 The time-averaged (a) and scale-averaged (b) wavelet variance of maize, spring wheat, cole, non-agri vegetation, and non-vegetation

Similarly, remarkably higher values were found from the scale-averaged wavelet variance profile of these three agricultural crops, followed by much lower values obtained from non-agricultural vegetation and non-vegetation (Fig. 5). An interesting phenomenon was observed from the scale-averaged wavelet variance profile of these three agricultural crops, with distinctly different patterns obtained within the scale range of 5 to 7 months and 9 months to 1 year, respectively. Among these three agricultural crops, relatively lower values were observed from the spring wheat within the first scale range, as opposed to the highest values obtained within the second scale range. It is obvious that maize has a relatively slight dynamic space within the smaller scale (semi-annual scale) and much stronger dynamic space around the larger annual scale. Cole had very high values in the scale-averaged wavelet variance profile at the semi-annual level, which corresponds to its high peak values and short growing period observed in the annual EVI time profile. The time-dependent and scale-dependent variations of the five land use classes from the training sites revealed by the TAWV and SAWV profiles acted as indicators of varying vegetation cycling processes. Therefore, given their unique TAWV and SAWV profiles, their curve shapes could be utilized as the major metric with which to identify maize, spring wheat, cole, non-agri vegetation, and non-vegetation.

Classification of agricultural crops

Because the separability of both TAWV and SAWV between agricultural crops and non-agricultural crops appeared to be significant during visual assessment, a separability evaluation was performed among the three agricultural crops. The best separability for TAWV was obtained in the whole year from January to December. Therefore, the JM distance for the TAWV profile of the whole year was applied to assess the similarity between the five types of land use and unclassified pixels. For the scale-averaged wavelet variance, the best separability was obtained at the scale level of 4 to 9 months (136∼280 days). The JM distance was calculated on this scale range of the scale-averaged wavelet variance profile for the same purpose. The classification results of agricultural crops using the methodology developed in this study are given in Fig. 6.

Fig. 6 Agricultural field map derived from (a) the CIWJ method based on MODIS images compared with (b) HJ-1-derived data

Accuracy assessment of agricultural crops

The spatial distribution of the three agricultural crops and other land uses was visually assessed in comparison with the HJ-1 image-derived products. We found that the spatial distribution of our crop classification map exhibited close similarity to that of the map interpreted from the HJ-1 images (Fig. 6). Quantitatively, Table 1 shows the confusion matrix between the MODIS classification using wavelet variance and JM distance and the HJ-1 image-interpreted data. The producer accuracies of maize and non-vegetation were 85.8 % and 97.1 %, respectively. The overall accuracy was 83.6 %, and the kappa coefficient was 0.7009, indicating good agreement. The producer and user accuracies of wheat/cole were much lower due to misclassification of wheat/cole as non-agri vegetation. The producer and user accuracies of non-agri vegetation were also considerably lower due to misclassification of non-agri vegetation as wheat/cole.

Table 1 Accuracy assessment
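The accuracy measures reported here follow from a confusion matrix in the standard way. The sketch below shows the arithmetic under the assumption that rows hold the reference classes and columns hold the classified classes; the function name is hypothetical, and the 2 × 2 matrix is invented for illustration and does not reproduce Table 1.

```python
def accuracy_report(matrix):
    """Overall accuracy, kappa, producer and user accuracies.

    matrix[i][j] counts pixels of reference class i assigned to class j.
    """
    m = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(m)) for j in range(m)]
    diag = [matrix[i][i] for i in range(m)]
    p_o = sum(diag) / total                                   # observed agreement
    p_e = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2  # chance agreement
    return {
        "overall": p_o,
        "kappa": (p_o - p_e) / (1.0 - p_e),
        "producer": [d / r for d, r in zip(diag, row_sums)],
        "user": [d / c for d, c in zip(diag, col_sums)],
    }


# Invented counts for two classes, for illustration only
report = accuracy_report([[40, 10], [5, 45]])
```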

A confidence map was also produced for the MODIS-derived classification map corresponding to the percentage accuracy of each respective pixel based on the method proposed in this study. The confidence map revealed the percentage certainty (0∼100 %) with which each pixel was assigned a specific crop type. The calculation of the confidence map was based on TAWV JM distance of its assigned land use class for each pixel (SAWV JM distance could also be applied depending on which metric was mainly utilized). If the JM distance of its assigned land use class for one specific pixel was less than 1.0, its value on the confidence map was given as 100 %; otherwise, it was calculated as C = (2 − A) × 100 %, where C denotes its value on the confidence map, and A represents the JM distance of its assigned land use class for one specific pixel. If pixels classified with less than 90 % confidence were excluded, the producer and user accuracies of wheat/cole, non-agri vegetation increased to 70.6 and 75.9 %, respectively. The total accuracy and kappa index increased to 89.3 % and 0.8151, respectively.
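The confidence rule stated above is a piecewise function of the JM distance A. A minimal sketch, with the function name and the range check added as assumptions:

```python
def confidence(jm):
    """Percent confidence for a pixel from the JM distance to its assigned class.

    Below 1.0 the confidence is 100 %; otherwise it is (2 - A) * 100 %.
    """
    if not 0.0 <= jm <= 2.0:
        raise ValueError("JM distance is expected to lie in [0, 2]")
    return 100.0 if jm < 1.0 else (2.0 - jm) * 100.0
```

Under this rule the 90 % cut-off mentioned in the text corresponds to keeping pixels whose JM distance is at most 1.1.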

Spatiotemporal explicit information on crop distribution is critically needed for researchers and policy makers, but it is not included in existing large-area land cover datasets (e.g., the MCD12Q1 land use/cover product). More efficient time-series classification methods are urgently needed for remote sensing applications. Through transforming the daily MODIS EVI time series into a time-frequency wavelet spectra and further into TAWV and SAWV profiles, the CIWJ method can efficiently characterize different crops or land uses at both time and frequency dimensions. Compared with methods that utilize information on either frequency or time dimension (the first or second type of method as described in the “Introduction” section), the CIWJ method exhibits tremendous potential for efficient crop mapping. The CIWJ method enriches current time-series classification methodologies.

The CIWJ method was proven to be capable of mapping crop fields in arid regions. It can also be applied to humid regions. Each agricultural crop or specific land use has a particular vegetation dynamic pattern, which can be examined from corresponding TAWV and SAWV profiles. Before applying the CIWJ method, a detailed investigation should be carried out to evaluate how the TAWV and SAWV profiles might differ with various crop calendars and natural conditions (crop varieties, climate conditions). Considerable adjustments could be made to accommodate possible variability across different regions. For example, if the greening-up dates of a particular agricultural crop are affected by a relatively warmer climate, we might apply the scale-averaged wavelet variance as the main metric, which is robust to this phenomenon. In all cases, the combined utilization of both the TAWV and SAWV profiles can better characterize the vegetation dynamics that correspond to different agricultural practices.

Conclusions

A novel crop/land use mapping method was proposed in this study through combined utilization of both time and frequency information. The CIWJ method included the following procedures: (1) applying the Mexican hat and Morlet continuous wavelet transforms to obtain a time-frequency representation, the wavelet spectra; (2) calculating the TAWV and SAWV to indicate the temporal and scale dimensions of the wavelet spectra, which can be closely related to the time or frequency dynamic of each specific agricultural crop or land use class; (3) obtaining the best separability between established land uses based on the JM distance of the TAWV and SAWV profiles; and (4) computing the TAWV and SAWV JM distances between the training sites and unknown pixels at the specific time/scale ranges with the best separability, which could be utilized as the main indices for specific crop/land use discrimination. The proposed methodology was successfully applied to crop classification in the middle Hexi Corridor in northwest China based on MODIS EVI datasets. The results show a close spatial correspondence to the HJ-1-derived data, with an overall accuracy of 83.6 % and a kappa coefficient of 0.7009.

The CIWJ method was found to be a very promising method with tremendous potential for further applications. In this paper, the MODIS EVI time series datasets were applied. In addition to the relatively coarse, medium-resolution (e.g., MODIS) data, the CIWJ method could also be applied to time series datasets of high spatial resolution images (e.g., QuickBird, Rapid Eye, and IKONOS). Additionally, it could be directly and easily adapted to other more general land use or specific vegetation species classifications. Besides the VI time series datasets, other time series datasets could also be utilized to represent their corresponding spatiotemporal processes.

In order to overcome the limitations of a relatively coarse spatial resolution of MODIS images, future work could be done to combine MODIS images with other higher resolution ground-truth imagery in order to quantify the sub-pixel heterogeneity (Jain et al. 2013), or directly apply the CIWJ method to VI time series datasets of high spatial resolution images. With more accessibility and availability of relatively higher spatial resolution images, the locations and areas of agricultural crops could be more accurately estimated in the near future. In addition, supplementary datasets such as the land surface water index could be applied to better discriminate grass (aquatic plants, i.e., reeds), deciduous shrubs, and other plants from agricultural crops or land uses studied in this paper.