Introduction

Herbicide application over agricultural fields has conventionally been performed using the same type and dose of herbicide in a homogeneous way. However, each field may present enormous variability in the spatial distribution and abundance of weeds. Precision agriculture, particularly early site-specific weed management (ESSWM), has a large potential because weeds are generally distributed in patches (Jurado-Expósito et al. 2003), which implies the possibility of applying control measures according to weed density or weed group (e.g., broad-leaved and grass weeds) of each patch or small-defined area (Blackmore 1996; Kropff et al. 1997). To implement ESSWM strategies, it is usually necessary to use prescription maps. One of the most successful approaches for the acquisition of crop and weed spatial information is through remotely sensed images that can be processed, classified and divided into a series of sub-plots for further adapted applications according to the specific weed emergence (Gómez-Candón et al. 2012).

Two of the most important variables in using remote imagery for mapping weeds are the image spatial resolution and the phenological stage of the crop and weeds. The higher the spatial resolution, the finer the details discriminated in the image. Regarding the phenological stage, late-season (flowering or senescent stage) weed detection maps can be used to design site-specific weed management (SSWM) strategies, applying either in-season post-emergence herbicide treatments, in case adequate pre-emergence control was not achieved, or pre-emergence treatments in subsequent years, taking into account that weed infestations can be relatively stable in location from year to year (Jurado-Expósito et al. 2004). Thus, the capacity to discriminate weeds at an advanced phenological stage could reduce herbicide use and control costs (López-Granados et al. 2006; De Castro et al. 2012, 2013). However, in most weed control situations, including ESSWM, it is generally necessary to control weeds at an early growth stage of the crop, i.e., when the crop and weeds are seedlings or have two to four true leaves.

The remote sensing of grass weed seedlings in monocotyledonous crops (e.g., Avena spp. or Phalaris spp. in winter cereals) and seedlings of broad-leaved weeds in dicotyledonous crops (e.g., Malva spp., Amaranthus spp. in sunflower crop) presents much greater difficulties than mapping them at a late growth stage for three main reasons: (i) cereal crops and grass weeds (and many dicotyledonous crops and broad-leaved weeds) generally have similar appearance early in the season; (ii) the distribution of weeds can be in small patches, which could indicate the necessity to work with very high spatial resolution imagery (López-Granados 2011); and (iii) the soil background reflectance interferes with detection (Thorp and Tian 2004). Accordingly, very high spatial resolution imagery (e.g., pixel ≤2–3 cm) may be needed to detect small weeds and crop plants at an early phenological stage to develop early in-season post-emergence treatments.

Recently, unmanned aerial vehicles (UAVs) have been presented as a promising tool for many agronomic applications (Schmale et al. 2008), including precision agriculture approaches (Zhang and Kovacs 2012; Torres-Sánchez et al. 2013). The development of UAVs has great potential for mapping weeds at different phenological stages (both early and late season) within the context of SSWM. The advantages of using a UAV over a piloted aircraft include the ability to fly at a very low altitude (and, consequently, the possibility to generate very high spatial resolution imagery), lower image acquisition costs than from manned aircraft, the ability to deploy an aircraft with a multi-spectral camera onboard relatively quickly, and the ability to use the camera at the right moment and repeatedly for weed detection, making possible the combination of high spatial, spectral and temporal resolutions (Rango et al. 2006).

Due to the low payload capability of most small and lightweight UAVs, there are technical problems related to differences in spatial resolution or in viewing angles between consecutive images within the same flight mission (Torres-Sánchez et al. 2013). As a result of their intrinsic characteristics, UAV images (taken at low altitude) cannot cover the whole study plot, resulting in the need to take a sequence of multiple images (e.g., approximately 30–100 frames ha⁻¹). To avoid geometric distortion due to low altitude, image overlapping is usually employed, so a larger number of UAV images must be acquired for each field. These multiple overlapped images must be stitched together and ortho-rectified to create an accurately geo-referenced ortho-mosaicked image of the entire plot. This process is named geo-registration and is needed in UAV imagery processing to geo-locate targets and to create maps for further analysis and classification (Lin and Medioni 2007). The accuracy of ortho-mosaicked image co-registration is usually expressed as the root mean squared error (RMSE) and is jointly dependent on the following: (i) the internal and external orientation of the camera; the internal orientation is the camera attitude and degree of optical image distortion and can be determined by a camera calibration, whereas the exterior orientation defines the location of the camera in space (including altitude above the ground) and its view direction during image acquisition; (ii) the density and distribution of ground control points (GCPs); and (iii) the topographic complexity of the scene, which controls the degree of image foreshortening and can only be adequately corrected by ortho-rectification. Thus, the quality of image registration is highly dependent on the configuration of GCP targets (Vericat et al. 2009).

Recent studies on crop-weed discrimination using UAVs at early growth stages have focused on crop line detection in single images as a first step for further discrimination of crop and weeds. Every plant that is not located on the crop line can be assumed to be a weed, an assumption that makes inter-row weed detection more affordable (Peña-Barragan et al. 2012). Therefore, the assessment of crop line mis-alignment is one of the most important starting points for ESSWM; it can be quantified as the Euclidean distance between the axes of the same crop row on both sides of the borderline between overlapped images.

As part of an overall research programme to investigate the opportunities and limitations of UAV imagery in accurately mapping weeds in winter wheat at the early season, it is crucial to explore the potential of generating accurate ortho-mosaicked imagery from multiple overlapped frames for proper discrimination of crop rows and weeds using multi-spectral cameras. Such an approach should demonstrate the accuracy of the ortho-images generated to indicate the ability of further accurate discrimination of weeds grown between crop rows to design a field program of ESSWM. Thus, this paper reports a study of the geometric accuracy of the ortho-imagery obtained from multiple overlapped images taken in wheat crops naturally infested by weeds at early stages using UAV images. The studies were focused on three parameters: flying altitude, number of GCPs and crop line mis-alignment.

Materials and methods

Locations and remote imagery acquisition

The studies were conducted in wheat crops located in the province of Seville in Andalusia (Southern Spain). Two plots of approximately 1.00 ha, named Monclova and Infantado, were sampled at the Monclova farm (Fig. 1). The geographic co-ordinates (Universal Transverse Mercator System, zone 30 North, WGS-84) of the upper left corner of the images were X = 295 627 m, Y = 4 158 205 m, and X = 295 800 m, Y = 4 157 498 m, respectively. The ground is flat (average slope <1 %). Wheat crops were sown in mid-November 2011 in rows 15 cm apart and were naturally infested by broad-leaved (Amaranthus spp., Convolvulus spp., Chrysanthemum spp., Anthemis spp., Malva spp. and Sinapis spp.) and grass weeds (Avena spp. and Phalaris spp.). The weeds and crops were at the early vegetative growth stage (2–3 true leaves, Fig. 2). At the study plots, a set of overlapped images was taken from a Microdrones MD4-1000 quadrotor UAV (Fig. 3a) equipped with a base station (Fig. 3b) and an Olympus EP-1 camera (red, green and blue bands) on the 12th and 13th of January 2012. The lateral overlap of images was 30 %, and the longitudinal overlap was 60 %. The mission planning consisted of recording the co-ordinates for the flight area in every field and then transferring the information to the UAV ground station software to generate flight rows, flight altitudes and image overlap. Over the whole field, a total of 53 artificial GCPs (0.04 m² white square targets with an orange cross, Fig. 2) were placed on a grid of approximately 12.5 × 12.5 m, and every GCP was then geo-referenced using a Trimble Geo-XH Differential GPS (DGPS). Based on the Spanish government network of ground-based reference stations, DGPS data were post-processed, obtaining a final accuracy of 2 cm. A 1 m² white square frame was placed next to every GCP for weed sampling for further weed discrimination (Fig. 2).

Fig. 1 Study area location

Fig. 2 Wheat crop rows, broad-leaved weeds between rows, 1 m² frame and 0.04 m² GCP (numbered 1)

Fig. 3 (a) UAV; (b) ground base station

Image processing

The exterior position and orientation parameters of the UAV, referring to the roll, pitch and yaw angles of every overlapped image, were provided by the UAV inertial system. These parameters were used as input data to the Leica Photogrammetry Suite 2010 (LPS; Leica Geosystems 2006) software for ortho-rectification by aero-triangulation and mosaicking. Aero-triangulation involves the transformation of image co-ordinates to ground co-ordinates through a set of GCPs that are clearly visible in the set of images. This step consists of forcing an exact match between image and GCP co-ordinates, followed by a piecewise triangular interpolation of pixels between three tie points, which are points common to overlapped images. Manual tie points were taken at the corners of the frames and were entered into the software together with the GCP co-ordinates. Additional tie points were generated automatically to improve the aero-triangulation results. This procedure was needed to determine the corrected position and orientation parameters of each image in the series. The camera was not calibrated prior to the flights; instead, camera calibration parameters were calculated by comparing overlapping zones of consecutive images during the aero-triangulation process. Afterwards, the images were combined into a seamless ortho-mosaicked image of each entire wheat field.

Effect of flight altitude

A series of 11, 30 and 30 images corresponding to 100, 60 and 30 m flying altitudes, respectively, were taken and processed for ortho-rectification and mosaicking using Leica Photogrammetry Suite (LPS) software to create a unique ortho-image of every wheat field. LPS assigns co-ordinates to the image using, as input, 45 out of the 53 GCPs placed and geo-referenced in the field. The minimum number of GCPs used for creating every mosaic was equal to the number of single images in the series (e.g., for a 100 m flight altitude, no fewer than 11 GCPs can be used because 11 overlapped images were taken). Once the ortho-mosaics were generated, the RMSE was calculated to assess the accuracy of the ortho-mosaic for every flight altitude, as described below.

Effect of number of GCPs

To determine the geo-referencing accuracy of the ortho-image according to the number of GCPs, image series taken at a height of 100 m were studied in both locations. Each series was composed of 11 images and was ortho-rectified and mosaicked using a different number of GCPs (from 11 to 45 out of 53 GCPs). Finally, the RMSE of each resulting ortho-mosaicked image was calculated, as described below.

Crop line mis-alignment

The study of crop line mis-alignment at the border of two overlapped single images from each ortho-mosaicked image was performed as follows. A total of fourteen wheat crop rows were selected in each ortho-image. The crop line mis-alignment was represented by the minimum Euclidean distance between the axes of the same crop row on both sides of the mosaic inner borders, i.e., a metric based on the Pythagorean theorem, and was calculated for a co-ordinate point by the equation (Slama et al. 1980)

$$ \text{Mis-alignment} = \left[ \left( X_{l} - X_{r} \right)^{2} + \left( Y_{l} - Y_{r} \right)^{2} \right]^{1/2}, $$
(1)

where $X_{l}$ and $Y_{l}$ are geospatial co-ordinates of the point on the left side image, and $X_{r}$ and $Y_{r}$ are co-ordinates of the same point on the right side image. The distance was calculated by selecting the stitched point of every axis in both consecutive single images of the mosaic.
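As an illustration of Eq. 1, the following is a minimal sketch of the mis-alignment calculation for one crop row axis. The function name and the example co-ordinates are hypothetical and are not taken from the study data; co-ordinates are assumed to be in metres (UTM).

```python
import math


def crop_line_misalignment(left_xy, right_xy):
    """Euclidean distance (Eq. 1) between the same crop-row axis point
    as located on the left-side and right-side images of the mosaic."""
    x_l, y_l = left_xy
    x_r, y_r = right_xy
    return math.hypot(x_l - x_r, y_l - y_r)


# Hypothetical UTM co-ordinates (m) of the stitched point of one crop-row axis;
# these values are illustrative only.
left_point = (295650.000, 4158100.000)
right_point = (295650.030, 4158100.040)

# Convert from metres to centimetres, the unit used for reporting in the text.
misalignment_cm = crop_line_misalignment(left_point, right_point) * 100.0
print(f"Mis-alignment: {misalignment_cm:.2f} cm")
```

In practice the same function would be applied to each of the fourteen selected crop rows and the distances summarised per ortho-mosaic.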

Accuracy assessment

Accuracy assessment consisted of assessing the error associated with ortho-rectification of the UAV imagery. This error is commonly expressed by the RMSE of the ortho-mosaicked image. The RMSE is usually expressed in the units of pixel size (cm in this study), is a global indicator of the quality of the mosaic, and is based on the residuals of the image co-ordinates and the ground co-ordinates. The geometric accuracy of the ortho-rectified mosaic was assessed using the co-ordinates of seven out of 53 GCPs collected on the ground using the differential GPS. Then, the RMSEs were calculated as follows: once the mosaic was generated, co-ordinates of the seven GCPs measured in the field (and not used during the mosaicking process) were compared to the co-ordinates of these seven GCPs in the mosaicked image using ENVI image processing software (Research System Inc., Boulder, CO, USA). Finally, the differences between DGPS co-ordinates and co-ordinates from the mosaicked image were used for calculating the RMSE in the units of the pixel size (cm).

The RMSE for an image with n validation points is assessed as follows (ERDAS 1999):

$$ RMSE = \left[ \sum\limits_{i = 1}^{n} \left[ \left( X_{s} - X_{r} \right)^{2} + \left( Y_{s} - Y_{r} \right)^{2} \right] / n \right]^{1/2} $$
(2)

where $X_{s}$ and $Y_{s}$ are geospatial co-ordinates of the point on the source image, and $X_{r}$ and $Y_{r}$ are co-ordinates of the same point measured by DGPS on the terrain.
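Eq. 2 can be sketched in a few lines of code. The function name and the three validation points below are hypothetical (the study used seven independent GCPs per mosaic); co-ordinates are assumed to be in metres, and the result is converted to centimetres.

```python
import math


def rmse(image_xy, dgps_xy):
    """RMSE (Eq. 2) between co-ordinates read from the mosaicked image
    and the co-ordinates of the same points measured by DGPS."""
    if not image_xy or len(image_xy) != len(dgps_xy):
        raise ValueError("Both lists must be non-empty and of equal length")
    total = 0.0
    for (x_s, y_s), (x_r, y_r) in zip(image_xy, dgps_xy):
        total += (x_s - x_r) ** 2 + (y_s - y_r) ** 2
    return math.sqrt(total / len(image_xy))


# Hypothetical validation points in a local metric system (m); illustrative only.
dgps_points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
image_points = [(0.003, 0.004), (10.0, 0.0), (0.0, 10.005)]

rmse_cm = rmse(image_points, dgps_points) * 100.0
print(f"RMSE: {rmse_cm:.3f} cm")
```

Dividing the result by the pixel size of the mosaic expresses the same error in pixels, which is how it is compared with the spatial resolution in the results.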

Results and discussion

All image series were taken at both locations near solar noon over two consecutive days. Weather conditions were very similar in both image series. The only notable difference was that the weather at the Monclova plot was slightly cloudy when the images were taken, while it was rather sunny at Infantado. This difference is important for repeatability studies, where imagery has to be acquired for the same sites at different times.

For both locations, the UAV image series at the three different flight altitudes were taken and ortho-mosaicked. The results are shown in Table 1. The final spatial resolutions of the ortho-mosaics were 2.47, 1.48 and 0.74 cm for flight altitudes of 100, 60 and 30 m, respectively. The overall RMSEs were very similar between ortho-mosaics regardless of the flight altitude and ranged from 0.21 to 0.29 cm and 0.08 to 0.28 cm for Monclova and Infantado, respectively. Thus, the results did not show large differences in accuracy across the range of altitudes studied (30–100 m) and, in every case, the RMSE was less than 0.30 cm. To corroborate these observations, an analysis of variance was carried out between altitudes and localities, and no statistically significant differences were found. Therefore, one of the relevant results of the study is that flying altitude did not appear to affect the georeferencing accuracy of the mosaicked image. Taking into account the pixel size obtained in the imagery, the RMSE was less than 1 pixel in every case. According to Laliberte et al. (2010), an RMSE of 1 pixel or less is desirable with aerial photography, but such an RMSE is difficult to achieve with UAV imagery. They reported that errors of 1.5 to 2 pixels from the aerial triangulation could be acceptable for UAV imagery acquired with low-cost cameras, although they only discuss the relevance of the RMSE associated with their aerial triangulation analysis without considering the GPS error. However, together with the RMSE, the GPS accuracy could also affect the geo-registering and mosaicking. In this study, considering a DGPS error of 2 cm, the 2.47, 1.48 and 0.74 cm spatial resolutions achieved for the 100, 60 and 30 m altitudes could add increments of 0.81 (DGPS error/2.47 cm spatial resolution), 1.35 (DGPS error/1.48 cm spatial resolution) and 2.70 (DGPS error/0.74 cm spatial resolution) pixels to the RMSE, respectively. Consequently, the total errors obtained (i.e., RMSE plus DGPS error) would be 0.91, 1.52 and 2.90 pixels for the 2.47, 1.48 and 0.74 cm spatial resolutions, respectively. Given this slight increment of error and following the criteria of Laliberte et al. (2010), the results presented in this paper are expressed considering only the RMSE. An example of one of the resulting ortho-mosaics of the 100 m height series at Monclova is shown in Fig. 4.
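The conversion of the DGPS error into pixels can be reproduced with a short sketch. The variable and function names are hypothetical; the numerical inputs are the DGPS error, pixel sizes and total errors reported above, and the "implied" RMSE in pixels is simply the difference between the reported total and the DGPS increment.

```python
# Values reported in the text; altitudes in metres, sizes in centimetres.
DGPS_ERROR_CM = 2.0
PIXEL_SIZE_CM = {100: 2.47, 60: 1.48, 30: 0.74}
TOTAL_ERROR_PX = {100: 0.91, 60: 1.52, 30: 2.90}


def error_in_pixels(error_cm, pixel_size_cm):
    """Express a ground error (cm) as a number of pixels of a given size (cm)."""
    return error_cm / pixel_size_cm


# DGPS error expressed in pixels at each altitude.
dgps_error_px = {
    altitude: round(error_in_pixels(DGPS_ERROR_CM, pixel), 2)
    for altitude, pixel in PIXEL_SIZE_CM.items()
}

# RMSE share (in pixels) implied by the reported totals.
implied_rmse_px = {
    altitude: round(TOTAL_ERROR_PX[altitude] - dgps_error_px[altitude], 2)
    for altitude in PIXEL_SIZE_CM
}

for altitude in PIXEL_SIZE_CM:
    print(altitude, dgps_error_px[altitude], implied_rmse_px[altitude])
```

The sketch makes explicit that the same 2 cm ground error weighs more, in pixel terms, as the pixel size shrinks at lower altitudes.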

Table 1 RMSE of Monclova and Infantado plots at different flying altitudes

Fig. 4 Example of the resulting ortho-mosaics of the 100 m altitude flight at Monclova; 48 white 1 m² frames are visible

Regarding the effect of the number of GCPs employed for mosaicking, the mean RMSEs were between 0.25 and 0.17 cm for the 11 to 45 GCPs (Table 2). In every case, the RMSE was less than the pixel size (2.47 cm). Therefore, although there were slight differences in RMSE, they can be considered insignificant and irrelevant compared with the magnitude of the pixel size of the image and the target crop (wheat with a crop row approximately 15 cm wide). Some studies have demonstrated that an increase in the number of GCPs would remarkably improve the georeferencing accuracy (Gómez-Candón et al. 2011). However, the results showed that the RMSE improvement was negligible for the studied image series. These differences can be explained by the following: (i) the lower covered area (1 ha compared to 80 ha), (ii) a higher spatial resolution (2.47 cm compared to 10 cm), (iii) the use of image series instead of one single image, and (iv) the higher proportion of GCPs per area (up to 45 GCPs per 1 ha compared with 23 GCPs per 80 ha).

Table 2 Overall RMSE of Monclova and Infantado plots at 100 m altitude according to the number of GCPs used for the mosaicking process

Discarding the altitude and number of GCPs as a cause of error, there are other factors that could have been involved in the georeferencing accuracy, such as the use of accurate Digital Elevation Models (DEMs), the DGPS accuracy itself, and the sensor and aerial platform characteristics (Mostafa et al. 2001; Lunetta et al. 1991). The study plots were relatively flat, with a slope less than 1 %. Consequently, the terrain effect was not appreciable, and no DEM effect was studied. However, the implementation of an accurate DEM could be a further research line to assess whether the slope could be an important source of variability.

An example of crop line mis-alignment is shown in Fig. 5. The crop line mis-alignments were 4.60, 1.15 and 1.45 cm for the 100, 60 and 30 m images, respectively (Table 1). This result means that every mis-alignment was less than twice the spatial resolution of the corresponding mosaic (2.47, 1.48 and 0.74 cm for the 100, 60 and 30 m images, respectively). At the time of image acquisition, the wheat crop was at an early vegetative growth stage (2–3 true leaves) and every wheat row was approximately 15 cm wide. Thus, a mis-alignment of 4.60 cm did not break the crop line continuity. If notable differences in crop line alignment had occurred, the crop rows of single overlapped images might not have fitted together. Consequently, crop alignment could be broken and further crop-weed discrimination hindered. Crop row detection requires that an accurate row-to-row correspondence be found between overlapped images of the input sequence. This is one of the most important starting points for ESSWM, since the definition of the row structure formed by the crop is essential for further identification of plants (crop and weeds): the position of each plant relative to the rows might be the key feature used to distinguish between weeds and crop plants. Therefore, crop mis-alignment must particularly be avoided in crops sown with very narrow rows, as happens in winter cereals (e.g., wheat, barley, rye), which are usually sown in rows 15 cm apart. Hunt et al. (2010) worked in wheat fields to measure crop leaf area index; however, the images over the wheat field were analysed one by one and not combined into a mosaicked image. Their procedure provides field information but does not allow information to be obtained for the whole field for ESSWM.
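The comparison between mis-alignment, pixel size and row spacing can be checked with a short sketch. The variable names are hypothetical; the numbers are the mis-alignments, spatial resolutions and the 15 cm row spacing reported in the text.

```python
# Values reported in the text; altitudes in metres, distances in centimetres.
PIXEL_SIZE_CM = {100: 2.47, 60: 1.48, 30: 0.74}
MISALIGNMENT_CM = {100: 4.60, 60: 1.15, 30: 1.45}
ROW_SPACING_CM = 15.0

# Mis-alignment expressed in pixels of the corresponding mosaic.
misalignment_px = {
    altitude: MISALIGNMENT_CM[altitude] / PIXEL_SIZE_CM[altitude]
    for altitude in PIXEL_SIZE_CM
}

# Mis-alignment expressed as a fraction of the distance between crop rows.
misalignment_vs_row = {
    altitude: MISALIGNMENT_CM[altitude] / ROW_SPACING_CM
    for altitude in PIXEL_SIZE_CM
}

for altitude in PIXEL_SIZE_CM:
    print(
        f"{altitude} m: {misalignment_px[altitude]:.2f} px, "
        f"{misalignment_vs_row[altitude]:.0%} of the row spacing"
    )
```

Under these figures, every mis-alignment stays below two pixels and below one third of the row spacing, which is consistent with the statement that crop line continuity was not broken.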

Fig. 5 Mosaicked overlapping area. The dividing line between single images is marked in red, wheat crop row axes in yellow and crop row misalignment in blue

At 100 m flight altitude, there were no mis-alignment differences regardless of the number of GCPs used, and the average crop line mis-alignment was 4.60 cm in every case. Therefore, as shown in Table 1, crop alignment was affected by the flight altitude but not by the number of GCPs. This result can be explained by the tie points being more involved in the mosaicking process, whereas the GCPs are more useful for the georeferencing process. To corroborate this statement, the effect of the number of tie points will be taken into account in future studies.

It is notable that each ortho-mosaic required about 4 h of laboratory work, and that this time depended on the number of single overlapped images rather than on the area covered by every flight. This time has to be added to the duration of the subsequent image analysis (i.e., obtaining weed maps of wheat crops). The balance between the total duration of the process and the time available for decision making will determine the feasibility of the whole method.

Conclusions

Flight altitude is an important parameter to take into account when acquiring remote images using a UAV. However, there were no georeferencing RMSE differences between the ortho-mosaics created from UAV flights at altitudes between 30 and 100 m. For this reason, when selecting the flight altitude, two parameters are more crucial than image errors: the first is the spatial resolution needed to discriminate between weeds and crops, and the second is the number of single images needed to cover the entire area, because a high number of single images makes the ortho-rectification and mosaicking processes more difficult, and the number of images per hectare increases steeply at very low altitudes, following an asymptotic curve. In addition, the operation time is limited by the UAV battery duration. All these variables have major implications for early-season weed mapping which, according to the results, involves two main conditions: (1) to provide remote images with a fine spatial resolution to guarantee weed discrimination, and (2) to minimise the operating time and the number of images to reduce the limitations of flight duration and image mosaicking, respectively.
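The growth in the number of images at low altitude can be illustrated with a rough, idealised sketch. It uses the pixel sizes and the 30 % lateral and 60 % longitudinal overlaps reported in this study, but the frame size in pixels is an assumption (it is not stated in the text), as is the choice of flying with the long side of the frame across the flight line. Edge effects around the plot are ignored, so the figures are lower bounds and not the image counts actually flown.

```python
# Pixel sizes (cm) of the ortho-mosaics reported in the text, keyed by altitude (m).
PIXEL_SIZE_CM = {100: 2.47, 60: 1.48, 30: 0.74}

# ASSUMPTION: hypothetical frame size in pixels; not given in the paper.
FRAME_WIDTH_PX = 4032
FRAME_HEIGHT_PX = 3024


def images_per_hectare(pixel_size_cm, lateral_overlap=0.30, longitudinal_overlap=0.60):
    """Idealised number of frames needed per hectare for a given pixel size,
    counting only the new (non-overlapped) ground area added by each frame."""
    pixel_size_m = pixel_size_cm / 100.0
    footprint_across_m = FRAME_WIDTH_PX * pixel_size_m
    footprint_along_m = FRAME_HEIGHT_PX * pixel_size_m
    new_area_m2 = (
        footprint_across_m * (1.0 - lateral_overlap)
        * footprint_along_m * (1.0 - longitudinal_overlap)
    )
    return 10000.0 / new_area_m2


estimates = {altitude: images_per_hectare(pixel) for altitude, pixel in PIXEL_SIZE_CM.items()}
for altitude, frames in estimates.items():
    print(f"{altitude} m: about {frames:.1f} frames per hectare (idealised)")
```

Whatever frame size is assumed, the ratio between altitudes depends only on the pixel sizes: because the ground footprint scales with the square of the pixel size, the estimate at 30 m is roughly eleven times the estimate at 100 m.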

Regarding the number of GCPs, the RMSE was less than the pixel size in every case. Thus, the RMSE obtained using one GCP per single image of the series (11 GCPs per 11 images taken at a 100 m altitude) was satisfactory, and the RMSE differences did not justify the use of more GCPs.

The mosaicking method studied is satisfactory to maintain the needed wheat crop alignment required for ESSWM, so no further improvements would be necessary. The images obtained were appropriate for crop line detection and further weed detection.

The methodology presented here could be used for mosaicking a range of small to large areas depending on the autonomy of the UAV. This advantage, together with the high temporal resolution and the ultra-high spatial resolution obtained (pixel sizes from 2.47 to 0.74 cm), would allow a high level of detail in the information extracted from the images for weed patch detection, which is the final objective of this research.

In this work, the studied plots had a flat terrain (<1 % slope), so no DEM was used. However, the ultra-high spatial resolution required may imply the use of very accurate DEMs in case higher slopes are present in the field. The implementation of accurate DEMs in the mosaicking process will be a line of research in future studies. Moreover, the effect of the number of tie points generated was not studied; however, it could be relevant for improving image alignment in uneven areas. This effect will also be considered in future works.

In conclusion, the most important achievement of this work was to obtain accurately georeferenced mosaics of wheat fields at the seedling stage with very high spatial resolution for further use in ESSWM using a low-cost camera. The flight altitude (especially important for image pixel size) and the number of GCPs required for a good-quality mosaic needed to be established for further work. As stated in the paper, flight altitude determines not only the image spatial resolution but also the number of images per hectare: the lower the altitude, the larger the number of images that must be taken and the smaller the area captured by each overlapped image. Both elements clearly affect the image mosaicking procedure and the operation time, which is limited by UAV battery duration. Similarly, the fewer GCPs required, the more efficient the field sampling. Therefore, with the objective of reconciling ultra-high spatial resolution with a minimal operating time and number of images, the optimum flight mission is to capture images at the highest possible altitude compatible with the spatial resolution required.