Introduction

Machine vision systems onboard autonomous tractors are useful tools for crop row detection in wide-row crops (Emmi et al. 2014; RHEA 2014). Crop row detection is crucial for site-specific treatments (Gée et al. 2008; Montalvo et al. 2012; Romeo et al. 2012; Guerrero et al. 2013). In such vehicles, navigation is based on the Global Positioning System (GPS). Although GPS technology provides good accuracy for precise guidance (Emmi et al. 2014), crop row detection is also crucial for correcting the small deviations that occur (Kise and Zhang 2008; Rovira-Más et al. 2003).

Outdoor agricultural environments are affected by uncontrolled lighting conditions (sudden shadows, excessive or poor illumination) that degrade image quality. Gaps can also be present in the crop rows due to lack of germination, defects during planting or pests/diseases. High weed densities with spectral signatures similar to the crop may appear in the inter-row spaces and close to the crop rows, producing false widths in the imaged crop rows and hence spurious crop rows during detection. Different plant heights and volumes (due to different growth stages) can also create severe problems. Curved crop rows with irregular inter-row spaces, together with terrain side slopes and movements of the tractor on irregular terrain or deviations during navigation, add another degree of difficulty. Curved crop rows can be common in some fields because of the land topography; they appear not only on agricultural terraces but also in flat parcels of land with irregular geometry.

Different imaging-based approaches have been proposed for detecting crop rows; some of them combine two or more techniques, and all apply prior knowledge mapped as different constraints (number of crop rows to be detected, camera system geometry, inter-row space, expected crop row positions and orientations, intrinsic and extrinsic camera parameters, vanishing point or imaging perspective projection). The main categories of methods are the following:

  (a)

    Hough transform (HT): initially proposed by Hough (1962), it is based on computing accumulations of counts corresponding to crop rows. The HT was applied for straight crop row detection in several wide-row crops (cauliflowers, sugar beet and wheat) by Marchant (1996), Astrand and Baerveldt (2005), Bakker et al. (2008, 2011), Rovira-Más et al. (2005), Han et al. (2012) and Tellaeche et al. (2008a, b). Modifications of the HT have also been applied by Ji and Qi (2011) with a gradient-based Random HT and by Gée et al. (2008) with a Double HT. Leemans and Destain (2006) applied previous knowledge about crop row position and direction to restrict the search in the Hough space. Jafri and Deravi (1995) developed a method for detecting parabolic curves that reduces the size of the accumulator from 4D to 3D for parabolas with different orientations. Inspired by the idea of the HT, Olsen (1995) and Romeo et al. (2012) accumulated green pixels along exploration lines where crop rows are expected.

  (b)

    Linear regression: estimations of slope (m) and intercept (b) are obtained for straight crop rows with equation y = mx + b. Least squares is the common method for fitting, sometimes combined with other strategies. Cotton, cereal and maize are crops where this technique has been applied (Billingsley and Schoenfisch 1997; Sogaard and Olsen 2003; Tillett and Hague 1999; Hague et al. 2006; Montalvo et al. 2012; Guerrero et al. 2013; Xue and Ju 2010; Vidović and Scitovski 2014).

  (c)

    Horizontal strips: the image is split into horizontal strips (multiple Regions Of Interest) of wide-row crops (wheat, maize, soybean), where the centers of gravity of the plant clusters determine the crop rows, considering their alignments and fitting straight lines by least squares or applying the HT (Sogaard and Olsen 2003; Fontaine and Crowe 2006; Sainz-Costa et al. 2011; Burgos-Artizzu et al. 2011; Jiang et al. 2015).

  (d)

    Vanishing point: under perspective projection, parallel straight crop rows in the 3D scene are imaged as lines intersecting at this point, and only such rows are considered in the image (Pla et al. 1997), sometimes combined with the HT (Romeo et al. 2012; Jiang et al. 2016).

  (e)

    Filtering: parallel crop rows in the 3D scene show specific features and patterns after filtering. Hague and Tillett (2001) applied band-pass filtering in the frequency domain. Bossu et al. (2009) and Vioix et al. (2002) applied wavelets and Gabor filtering in the spatial domain.

  (f)

    Stereo vision: crop row location and elevation maps were obtained from two stereo-cameras (Kise and Zhang 2008), sometimes combined with an inertial measurement unit (Rovira-Más et al. 2008). Kise et al. (2005) detected a point per image for guiding navigation in both straight and curved paths.

  (g)

    Regular patterns: template matching with dynamic programming for optimization was applied by Vidović et al. (2016) for straight and curved crop row detection in different crop types (maize, celery, potato, onion, sunflower and soybean) and growth stages. This method can work with an a priori unknown number of crop rows and unknown field geometry. However, it assumes that all the crop rows are equally spaced on each image row. The images originally captured were resized to 320 × 240 pixels in order to reduce the required computation time without significant loss of information. A higher resolution (e.g. 640 × 480) increases the execution time considerably, even when a compiled programming language (e.g. C++) is used. Moreover, performance diminishes on the crop rows near the top of the image, where the perspective projection destroys the repetitive pattern, and when grass is significantly present in the image.

According to the above considerations, a new global strategy was designed for detecting both straight and curved crop rows in wide-row crops, under the above-mentioned adverse environmental conditions and even with irregular inter-row spaces. This is the main contribution of the paper: the strategy achieves similar or better performance than existing strategies proposed for straight and curved crop row detection with regular inter-row spacing, and clear improvements for irregular inter-row spaces and for straight and curved crop rows co-existing in the same image. The proposed strategy exploits some partial procedures involved in the methods referenced above and integrates new procedures into the global strategy, so that crop row detection remains valid in the maximum number of situations. The main considerations were:

  (a)

    The monocular camera (no stereo vision) system geometry and the number of crop rows are known. However, they can vary depending on the machine vision system (sensor and optical system) selected.

  (b)

    The use of multi-ROIs limits the search space in the image, so that the HT works appropriately.

  (c)

    Linear regression is applied for the best fitting of linear/quadratic polynomials to the crop rows, under the assumption that both kinds of crop rows could appear in the same image. This implies that no regular patterns or filters were considered for capturing specific features based on parallel crop row alignments. Moreover, the linear regression does not need a prior estimate of the crop row locations for subsequent adjustment, unlike in Guerrero et al. (2013).

Materials and methods

The images used in this study were captured during February and March 2015 on an experimental maize field at the San Francisco Research Station, Tulcán-Carchi, Ecuador, located 2787 m above sea level (latitude 0.62°N, longitude 77.75°W) and covering an area of 2 ha. The terrain is irregular, with slopes of up to 12°. The acquisitions were spaced approximately six days apart over 45 days, i.e. they were obtained under different environmental lighting conditions and different growth stages of maize and weed plants. Figure 1 shows illustrative examples of the images captured in the field: (a) low presence of weeds; (b) different shadows projected on the ground, increasing the difficulty of the greenness extraction during image processing; (c) different sizes of maize plants and presence of gaps, with lengths up to 1.20 m within the same crop row. The gaps were due to errors during sowing, or perhaps because the maize seeds had not yet emerged. Furthermore, images acquired under different lighting conditions are shown in (d) clear day, (e) cloudy day and (f) sunny day. The crop rows were spaced 0.85 m apart, with variations ranging from 0.75 to 0.95 m, particularly in curved crop rows.

Fig. 1

Illustrative examples of the images captured in the field; a with a low level of weed; b presence of shade; c presence of gaps and plants of different sizes. Images acquired under different lighting conditions: d clear day; e cloudy day; f sunny day

The images were obtained with a GoPro Hero 3+ Black Edition color camera, model CHDHX-302, manufactured by Woodman Labs, Inc. (California, USA). This device is equipped with a 1/2.3″ image sensor, 1.55 µm pixel size and 3 mm focal length. The camera was mounted on the front of a New Holland TD90 tractor moving at an average speed of 3 km/h (~0.83 m/s). The machine vision system onboard the tractor was arranged at a height of 2 m above the ground, with the optical axis inclined 45° with respect to the ground (pitch angle) and without lateral displacements (i.e. with roll and yaw angles set to 0°). This arrangement, together with the focal length and the sensor resolution, allows the correspondence between distances and areas in the 3D scene and pixels in the image to be determined by a simple transformation between co-ordinate systems.
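For illustration, the following minimal sketch (in Python, not the authors' Matlab implementation) computes such a correspondence under a simple pinhole model with the stated height, pitch angle and optics; it ignores the GoPro's lens distortion, so the values are only indicative.

```python
import numpy as np

# Minimal pinhole sketch of the ground-to-image correspondence, assuming
# the arrangement in the text: height 2 m, pitch 45 deg, focal length 3 mm,
# pixel size 1.55 um. Lens distortion is ignored, so values are indicative.
H = 2.0                  # camera height above the ground (m)
PITCH = np.deg2rad(45)   # optical axis inclination with respect to the ground
F_PX = 3e-3 / 1.55e-6    # focal length expressed in pixels (~1935 px)

def ground_to_image(x_w, d):
    """Project a ground point (lateral offset x_w and distance d ahead of
    the camera, both in metres) to pixel offsets from the principal point."""
    z_c = H * np.sin(PITCH) + d * np.cos(PITCH)   # depth along the optical axis
    y_c = d * np.sin(PITCH) - H * np.cos(PITCH)   # vertical offset in camera frame
    return F_PX * x_w / z_c, F_PX * y_c / z_c     # (u, v) in pixels

# Lateral scale at the near edge of the ROI (3 m ahead of the camera):
u, _ = ground_to_image(1.0, 3.0)
print(f"about {u:.0f} pixels per metre at 3 m ahead")
```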

The digital images were captured under perspective projection and stored as 24-bit color images with a resolution of 3000 × 2250 pixels (7 Mpx), saved in the RGB color space in JPG format. Nevertheless, only a reduced area on the ground, the Region of Interest (ROI), is relevant either for applying site-specific treatments or as a reference for guidance. The size and location of the ROI must be specified considering (i) the number of crop rows to be detected and (ii) that the imaged ROI contains enough resolution (in pixels) to identify green plants (crop and weed) unambiguously from soil, with sufficient points to estimate the equations that define the crop rows. In the proposed approach, four crop rows were selected for detection, which is the number used in the RHEA (2014) project and reported by Emmi et al. (2014), because the implements had four pairs of burners, each pair acting on one crop row. Considering the four crop rows and the inter-row space of 0.85 m, the total width of the ROI is 3.4 m. The ROI was set to start 3 m in front of a virtual vertical axis traversing the center of the image plane in the camera, to avoid imaging other elements in front of the tractor. The length of the ROI was fixed at 5 m to provide sufficient image resolution, as explained above, based on the camera system arrangement (Pajares et al. 2016). The ROI area in the 3D scene (field) is mapped under perspective projection to an image region of 2000 × 650 pixels (width × length), which represents approximately 20% of the full original image and is always located at the same position inside the image. Figure 2(a) shows the ROI enclosed in the rectangle, which corresponds to the imaged 3D area in the maize field. Nevertheless, the method can be easily adapted to detect any number of crop rows with the corresponding camera setting. Considering the 5 m length of the ROI and the 0.83 m/s speed, the tractor takes about 6 s to travel that distance, at which point a new image, containing the next consecutive ROI, is acquired. The image processing time plus the time spent activating the implements, when required, must therefore stay below 6 s for the processing to be considered real-time.
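The ROI sizing and the real-time budget quoted above follow from simple arithmetic; a short worked check is given below.

```python
# Worked check of the ROI sizing and the real-time budget given in the text.
n_rows, inter_row = 4, 0.85            # crop rows and inter-row space (m)
roi_width_m = n_rows * inter_row       # 4 x 0.85 = 3.4 m
roi_share = (2000 * 650) / (3000 * 2250)
time_budget = 5.0 / 0.83               # 5 m ROI length at 0.83 m/s
print(roi_width_m, f"{roi_share:.0%}", f"{time_budget:.1f} s")
# -> 3.4 m wide ROI, ~19% (about 20%) of the frame, ~6 s processing budget
```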

Fig. 2

a Location of the ROI in the original RGB image as a rectangle (2000 × 650 pixels); b identification of the soil (dark), weed and crop (light) in the ROI

The images were processed in Matlab from MathWorks (2015), release 8.5 (R2015a), using an Intel Core i7 2.0 GHz processor with 8 GB RAM and the Windows 8.1 Pro operating system (64-bit). The proposed algorithm was developed using the graphical user interface development environment (GUIDE) incorporated into Matlab.

Image processing method

The proposed image processing procedure was designed to consist of three main phases: (i) image segmentation, (ii) identification of starting points and (iii) crop row detection. Figure 3 shows the full structure, including the flow chart. This process is illustrated with a representative image showing the results at the different steps.

Fig. 3

Proposed image processing method architecture

Segmentation

Because of the need to detect curved crop rows, it is crucial to identify crop plants with as much precision as possible. The precision obtained by Montalvo et al. (2013) was around 93% crop identification; this value establishes a lower limit, so a detection rate above 93% is necessary in order to distinguish the crop from weeds, which have similar spectral signatures. This is because, unlike in methods designed for straight crop row detection, there is no prior information or applicable geometric constraint for estimating crop row positions. Furthermore, crop row discontinuity, together with the random distribution of weeds, reinforces the need to achieve the maximum possible discrimination of the crop, with the 93% limit established above. In this regard, the double thresholding proposed by Montalvo et al. (2013) showed improved performance over other existing strategies involving supervised learning (Guerrero et al. 2012). Indeed, the double thresholding adjusts itself dynamically, without learning, to the highly changeable environmental conditions common in agricultural tasks. The experimental area contains invasive weeds (e.g. Brassica campestris, Pennisetum clandestinum, Rumex crispus, Calendula arvensis), including some species with green-reddish tones (e.g. Polygonum nepalense). The double thresholding is thus well suited to distinguishing these invasive weeds from crop plants, a distinction of special relevance for detecting curved crop rows. Under these considerations, the segmentation phase was designed with the following four subsequent processes:

  (a)

    ROI determination, as specified above. Figure 2a shows a ROI enclosed in the rectangle.

  (b)

    Greenness identification: the ExG index (Woebbecke et al. 1995) was selected because exhaustive studies with different indices (Guijarro et al. 2011) showed no apparent improvement over its performance. Figure 4a shows the resulting grayscale image after applying ExG to the ROI.

    Fig. 4

    a Greyscale image after applying ExG; b binary image after applying the Otsu-based double thresholding; c binary image after applying the morphological operations

  (c)

    Double thresholding, based on Otsu's method (Otsu 1979): the first threshold discriminates green plants from soil and the second, applied to the discriminated green plants, separates crops from weeds. Figure 2b shows weeds and crops in the ROI, and Fig. 4b the resulting binary image after thresholding, once weeds have been identified. However, some weeds may still remain due to their high spectral similarity to the crop, Fig. 2a.

  (d)

    Morphological operations: opening (Onyango and Marchant 2003) and majority operations (size 3 × 3) are applied to remove insignificantly small patches and spurious pixels from the binary image; Fig. 4c shows the resulting image. A diamond-shaped structuring element for opening, with a distance of 2 pixels from the structuring element origin to the points of the diamond, suffices. A sketch of this segmentation pipeline is given after the list.
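For illustration, a compact sketch of steps (b)–(d) follows, assuming Python with NumPy/SciPy/scikit-image rather than the authors' Matlab implementation.

```python
import numpy as np
from scipy import ndimage
from skimage.filters import threshold_otsu

def segment_roi(rgb):
    """Sketch of the segmentation phase. rgb: HxWx3 array in [0, 1].
    Returns a binary mask of crop pixels (weeds and soil removed)."""
    # (b) Greenness: Excess Green index ExG = 2g - r - b on normalised channels.
    s = rgb.sum(axis=2) + 1e-9
    r, g, b = rgb[..., 0] / s, rgb[..., 1] / s, rgb[..., 2] / s
    exg = 2 * g - r - b
    # (c) Double Otsu thresholding: green plants vs soil, then crop vs weed.
    plants = exg > threshold_otsu(exg)
    crop = plants & (exg > threshold_otsu(exg[plants]))
    # (d) Opening with a diamond structuring element (radius 2 px) followed
    # by a 3x3 majority filter to remove small patches and spurious pixels.
    k = np.abs(np.arange(-2, 3))
    diamond = (k[:, None] + k[None, :]) <= 2
    crop = ndimage.binary_opening(crop, structure=diamond)
    votes = ndimage.convolve(crop.astype(int), np.ones((3, 3)), mode="constant")
    return votes >= 5
```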

Identification of starting points

The ROI concept with sub-ROIs and the HT were applied here to obtain significant points on the base of the ROI, Fig. 5, which determine the starting points of the crop rows. The HT was used because it is a robust technique in the presence of noise or when parts of the objects are hidden or incomplete (Cuevas et al. 2010).

Fig. 5

Location of the starting points at the bottom strip of the ROI (up to half: horizontal line). Each starting point is made up of a crossing point (circle) at the base of the ROI and the slope of a straight line (sloped line)

Given the binary image obtained as described in the previous phase, containing white pixels that belong to crops and to weeds not identified above (Fig. 2a), a set of four starting points was identified to find the crop rows as follows:

  (a)

    The ROI is divided into two horizontal strips or sub-ROIs of equal size (top and bottom), Fig. 5. The underlying idea is that, with this division, curved and straight crop rows can be approximated by piecewise linear segments within the bottom strip.

  (b)

    The HT was applied to the bottom strip to identify pixel alignments that represent the expected four piecewise linear segments. The polar representation (Gonzalez and Woods 2010) was used (ρ = x cos θ + y sin θ), where ρ is the distance of the straight line to the origin and θ is the angle that the normal forms with the x-axis. To convert the Hough parameters (ρ, θ) to the image parameter space with slope (m) and intercept (b), the following equations were used:

    $$m = - \frac{\cos \theta }{\sin \theta },\quad b = \frac{\rho }{\sin \theta }$$
    (1)
  (c)

    Four peaks are identified in the Hough polar space, determining four pairs of m and b parameters associated with four straight lines; these are drawn on the bottom strip of the ROI, verifying that they cross both the lower and upper lines of this sub-ROI (Fig. 5). The four lines must differ in ρ by more than 300 pixels. This value was obtained considering that the ROI is 2000 pixels wide with 500 pixels per crop row on average, i.e. it represents a margin of tolerance close to 40% of the distance in pixels between crop rows; this percentage is the reference value when a different number of crop rows is to be identified. Prior contextual knowledge was also applied, restricting the angle θ to the range [−45°, 45°], which is flexible enough to accommodate important crop row inclinations with respect to the horizontal base of the ROI. The HT accumulator was designed with resolutions of 1 pixel and 1° respectively in polar co-ordinates. The intersection points between the detected straight lines and the lower line of the sub-ROI determine the four starting points, located at image co-ordinates (x, y) at the base of the sub-ROI. Figure 5 shows four starting points and four piecewise linear segments (reference lines). The slope (m) of each line indicates the direction along which to explore the plants within the bottom strip of the ROI. However, a high presence of weeds and gaps can cause errors in the detection of starting points. Thus, if the number of starting points is less than four, the image is rejected and a new image is captured and processed. A sketch of this starting-point search is given below.
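The following sketch assumes scikit-image's Hough routines (hough_line, hough_line_peaks) in place of the authors' Matlab implementation; the 300-pixel ρ separation and the [−45°, 45°] angular range follow the text.

```python
import numpy as np
from skimage.transform import hough_line, hough_line_peaks

def starting_points(bottom_strip):
    """bottom_strip: binary image of the lower sub-ROI. Returns up to four
    (x_base, m) pairs: the crossing point on the bottom edge and the slope
    of each detected reference line. Fewer than four -> reject the image."""
    thetas = np.deg2rad(np.arange(-45, 46))       # 1 deg angular resolution
    h, angles, dists = hough_line(bottom_strip, theta=thetas)
    # Peaks must differ by more than ~300 px in rho (accumulator bins of 1 px).
    _, peak_angles, peak_rhos = hough_line_peaks(h, angles, dists,
                                                 num_peaks=4, min_distance=300)
    y0 = bottom_strip.shape[0] - 1                # row index of the bottom edge
    points = []
    for theta, rho in zip(peak_angles, peak_rhos):
        # Polar line: x*cos(theta) + y*sin(theta) = rho; crossing with y = y0.
        x0 = (rho - y0 * np.sin(theta)) / np.cos(theta)
        # Eq. 1; crop-row lines are near vertical, so guard against sin = 0.
        m = -np.cos(theta) / np.sin(theta) if np.sin(theta) != 0 else np.inf
        points.append((x0, m))
    return points
```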

Crop row detection

Curved and/or straight crop rows are detected according to the following steps: (a) extraction of candidate points from reference lines; (b) regression analysis for fitting polynomial equations (straight/quadratic) and (c) final crop row selection and verification.

Extraction of candidate points for the crop rows

After obtaining binary images such as the one shown in Fig. 4c, an extra morphological dilation is applied to expand the segmented plants (crops and weeds). The structuring element used is a horizontal line 20 pixels in length (parameter w = 20). Dilation can fill gaps in the horizontal direction, so the plants appear with a greater amount of green mass. The size of 20 pixels represents an expansion of about 35 mm of plants at the base of the ROI (2000 pixels covering 3.4 m) in the 3D scene in the field (not on the imaged ROI), as explained above, which is a suitable value for the set of images tested. Figure 6a shows the result obtained after applying this step to the image from Fig. 4c. Depending on the growth stage of the plants, this value can be decreased or increased when weed pressure is high or low, respectively.
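For reference, a minimal SciPy equivalent of this dilation (an assumption; the original implementation is Matlab) is:

```python
import numpy as np
from scipy import ndimage

binary_roi = np.zeros((650, 2000), dtype=bool)   # placeholder segmented ROI
w = 20  # horizontal structuring element length (px); ~35 mm at the ROI base
dilated = ndimage.binary_dilation(binary_roi, structure=np.ones((1, w)))
```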

Fig. 6

a Binary image after applying the morphological dilation using a horizontal line of size 20 pixels as structuring element. b The ROI is divided into four horizontal strips of equal size. These strips lead to 5 horizontal lines (including the upper and bottom edges of the ROI) labeled from 1 to 5 from bottom to top of the ROI

The full ROI of the dilated image is split into four horizontal sub-strips of equal length, obtaining five lines (including the upper and bottom edges of the ROI). This number of lines was established by trial and error on the set of images used for experimentation. Figure 6b shows the five horizontal lines labeled from 1 to 5 from bottom to top of the ROI.

The expected crop rows are sequentially extracted from left to right (labeled from 1 to 4). For each expected crop row, a set of 5 points distributed along potential crop row alignments was obtained. These points are located on the horizontal lines labeled 1 to 5. Points 1 to 3 at the bottom of the ROI (Fig. 7) are obtained by computing intersections between the corresponding reference straight line (Fig. 5) and the three horizontal lines (labeled 1–3) that divide the ROI (Fig. 6b). The remaining two points at the top of the ROI were obtained as follows:

Fig. 7

Part of the ROI showing the first crop row. Points 1 to 3 are obtained by considering the intersections between the reference line (sloped) and the three horizontal lines (labeled as 1–3) respectively

  (1)

    The point B, placed at the top edge of the ROI (horizontal line #5), is obtained by extending the reference straight line (points 1, 2 and 3), Fig. 8.

    Fig. 8

    Part of the ROI showing the first crop row. The point B placed at top edge of the ROI (horizontal line #5) is obtained by extending the sloped line (points 1, 2 and 3). The point A is obtained by shifting B to the left along the x co-ordinate. Four points A, 3, 2 and 1 are used for fitting a parabola by least squares

  (2)

    The point A, also placed on the same line #5, Fig. 8, is obtained by shifting the point B to the left of the image by a number of pixels along the x co-ordinate. In this case, the value was set to 200 pixels by experimentation and corresponds, on average, to 0.67 m in the 3D scene at this part of the ROI in the maize field (not on the corresponding imaged area), representing 80% of the inter-row space (0.85 m).

  (3)

    From point A to B do (external loop):

    (a)

      A parabola was fitted by applying least squares using the following four points: A, 3, 2, 1 (Figs. 8, 9).

    (b)

      Point C, on the horizontal line #4, was obtained at the intersection between the previously fitted parabola and line #4, Fig. 9.

      Fig. 9

      Part of the ROI showing the first crop row. Point C is obtained at the intersection between parabola (fitted by least squares using A, 3, 2, 1) and horizontal line #4. The point D placed at horizontal line #4 is obtained by adding a threshold (66 pixels) to the x co-ordinate of point C

    (c)

      Point D, placed on the same line #4, was obtained by adding a threshold to the x co-ordinate of point C (Fig. 9). This value was set to a third of the previous displacement between B and A, i.e. 66 pixels, corresponding on average to 0.16 m in the 3D scene at this part of the ROI.

From point C to D do (internal loop):

  (d)

    A parabola (line drawn on the upper half in Fig. 10a) was fitted by least squares using five points: A, C, 3, 2, 1. The number of white pixels along the curved segment \(A\bar{C}3\) of the fitted parabola was obtained, and the points A, C were stored.

    Fig. 10

Part of the ROI showing the first crop row. a A parabola (drawn on the upper half) is calculated by least squares using five points: A, C, 3, 2, 1. b Four parabolas (drawn on the upper half) are calculated. Point C changes (increases) at each iteration

  (e)

    The x co-ordinate of point C was increased by a threshold set to 20 pixels. This value matches the length of the horizontal structuring element used during the dilation operation applied above in Fig. 6a (parameter w).

  (f)

    Steps (d) and (e) were repeated until the internal loop ended. The internal loop is executed four times, producing the four parabolas shown in Fig. 10b. Since the four parabolas were estimated by least squares from the five points (A, C, 3, 2, 1), they do not match exactly and do not necessarily pass through these points.

  (g)

    The x co-ordinate of the point A was increased by a threshold set to 20 pixels as before (parameter w).

  (h)

    Steps (a) to (g) were repeated until the external loop ended. The external loop was executed 10 times, and the two points A and C (two squares in Fig. 11a) were identified at the top of the ROI (lines #5 and #4 respectively) where the curved segment \(A\bar{C}3\) had the greatest accumulation of white pixels. A minimum of 50 white pixels, set by experimentation, is required for a segment to be considered as belonging to a crop row. Otherwise, the segment is assumed to span large gaps and the image is rejected; a new image is then captured and processed. A set of five significant points (labeled from 1 to 5 in Fig. 11b) distributed along the crop row was identified and stored.

    Fig. 11

Part of the ROI showing the first crop row. a The remaining two points A and C (two squares) are identified at the top of the ROI (lines #5 and #4 respectively). b A set of five significant points (asterisks labeled from 1 to 5) distributed along the crop row is identified

The full procedure (steps 1 to 3a–h) was repeated for the remaining starting points. Figure 12 shows the result of this process (sets of asterisks). A sketch of this candidate-point search is given below.
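The sketch below renders the external and internal loops for one crop row in Python. It is a free rendering under stated assumptions: parabolas are fitted as x = f(y), which is equivalent to Eq. 2 for near-vertical rows, and the helper names are hypothetical.

```python
import numpy as np

def count_white(mask, coeffs, y_top, y_bot):
    """White pixels of `mask` lying on the parabola x = f(y), rows y_top..y_bot."""
    ys = np.arange(y_top, y_bot + 1)
    xs = np.rint(np.polyval(coeffs, ys)).astype(int)
    ok = (xs >= 0) & (xs < mask.shape[1])
    return int(mask[ys[ok], xs[ok]].sum())

def search_top_points(mask, p1, p2, p3, lines, w=20, shift=200, n_ext=10):
    """Hypothetical rendering of steps 1 to 3(a-h) for one crop row.
    p1..p3: (x, y) points on lines #1-#3; `lines`: label -> image row
    (lines[5] is the top edge). Returns points A and C, or None (rejection)."""
    # Point B on line #5 by extending the reference line through points 1-3.
    ref = np.polyfit([p1[1], p3[1]], [p1[0], p3[0]], 1)   # points are collinear
    ax = np.polyval(ref, lines[5]) - shift                # A = B shifted left
    best = (-1, None, None)
    for _ in range(n_ext):                                # external loop: A -> B
        pts = [(ax, lines[5]), p3, p2, p1]
        par = np.polyfit([p[1] for p in pts], [p[0] for p in pts], 2)
        cx = np.polyval(par, lines[4])                    # C on line #4
        for _ in range(4):                                # internal loop: C -> D
            pts5 = [(ax, lines[5]), (cx, lines[4]), p3, p2, p1]
            par5 = np.polyfit([p[1] for p in pts5], [p[0] for p in pts5], 2)
            score = count_white(mask, par5, lines[5], lines[3])  # segment A-C-3
            if score > best[0]:
                best = (score, ax, cx)
            cx += w                                       # step C by w = 20 px
        ax += w                                           # step A by w = 20 px
    score, ax, cx = best
    if score < 50:           # minimum accumulation set by experimentation
        return None          # assumed a large gap: the image is rejected
    return (ax, lines[5]), (cx, lines[4])
```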

Fig. 12

A set of five significant points (asterisks) distributed along each crop row within the ROI are identified

Regression analysis

Once the set of five significant points along each crop row (asterisks in Fig. 12), including the corresponding starting point, was obtained by the previous process, polynomials of degree one (straight line) and degree two (quadratic curve) were fitted by the least squares technique. Five points proved sufficient to define a curve and to adjust properly to the curvature of the crop rows in the tested images. For straight lines, the coefficients to be estimated were the slope (m) and the intercept (b); for the quadratic polynomials, the estimated coefficients were a, b and c, Eq. 2.

$$y = mx + b; y = ax^{2} + bx + c$$
(2)

Figure 13a, b show the two fitted polynomials (straight and quadratic) for each crop row in the ROI. In (a) the polynomials are drawn on the binary image and in (b) on the original color image.

Fig. 13

Graphics of the two polynomials (straight and quadratic) fitted to each crop row within the ROI. a Polynomials drawn on the binary image; b polynomials drawn on the original color image; c the better-fitting detected crop rows within the ROI (curved rows)

The least squares technique also gives the norm of residues, which is a measure of the quality of the fit: the lower the norm, the better the fit. The norm of residues, Eq. 3, is computed from the differences between the experimental values and those predicted by the model.

$$R = \sqrt{\sum_{i=1}^{n} (x_{i} - \hat{x}_{i})^{2}}$$
(3)

where n is the number of points, x the experimental values and \(\hat{x}\) the values predicted by the model.
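As an illustration, the fit-and-select step can be written compactly with NumPy; in this sketch the models are fitted as x = f(y) for near-vertical rows, with the roles of the variables in Eq. 2 swapped.

```python
import numpy as np

def fit_row(xs, ys):
    """Fit degree-1 (straight) and degree-2 (quadratic) models to the five
    significant points and keep the one with the lower residual norm, Eq. 3."""
    best = None
    for deg in (1, 2):
        coeffs = np.polyfit(ys, xs, deg)
        r = np.sqrt(np.sum((xs - np.polyval(coeffs, ys)) ** 2))   # Eq. 3
        if best is None or r < best[0]:
            best = (r, deg, coeffs)
    return best  # (norm of residues, degree, coefficients)

# Hypothetical five significant points (x, y) along one crop row:
xs = np.array([310.0, 335.0, 372.0, 420.0, 480.0])
ys = np.array([649.0, 487.0, 325.0, 162.0, 0.0])
print(fit_row(xs, ys))  # a curved row selects the quadratic model
```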

Final crop row selection and verification

Two steps were performed in this process: (i) the selection of rows and (ii) the verification of the selected rows. In the first step, the polynomial (either straight or quadratic) that best fits each crop row was selected, according to the minimum value of the norm of residues (R). Figure 13c shows the four curved crop rows selected from Fig. 13a.

In the second step, a validation of the previously selected polynomials was performed. Two things are verified: (i) the separation between rows and (ii) the row orientation. In the first case, two consecutive rows must maintain a known separation distance at the bottom of the ROI. A greater separation distance indicates some anomaly; perhaps the seeds sown in the furrows did not fully germinate. Figure 14a shows the distance between two points at the bottom of the ROI. Based on this assumption, the verification rule can be expressed as follows: if the distance is greater than a threshold, the whole image is rejected, otherwise it is accepted. In this study, the threshold was established by experimentation at 600 pixels.

Fig. 14

Verification process for the detected crop rows; a the rows are too separated due to the absence of plants in the central crop row; b the rows cross in the ROI due to gaps and a high level of weed. In both cases, the whole image is rejected

In the second case, two detected rows should not cross each other within the ROI or when they are extended below the bottom of the ROI. Figure 14b shows two rows intersecting; this situation appears under high weed pressure. The crossing point (\(X_{cross}, Y_{cross}\)) is obtained by equating the two straight line equations, and the verification rule is as follows: if the y co-ordinate (\(Y_{cross}\)) is less than or equal to zero, the whole image is accepted, otherwise it is rejected.
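Both verification rules can be sketched as follows (a hypothetical helper; the crossing test uses the straight-line fits, as in the text, under the convention that \(Y_{cross} \le 0\) falls outside the ROI):

```python
def verify_rows(base_xs, line_models, sep_max=600):
    """Sketch of the two verification rules. base_xs: x co-ordinates of the
    detected rows at the bottom of the ROI; line_models: (m, b) per row.
    Returns True to accept the image, False to reject it as a whole."""
    # (i) Separation: consecutive rows must not be farther apart than 600 px.
    xs = sorted(base_xs)
    if any(x2 - x1 > sep_max for x1, x2 in zip(xs, xs[1:])):
        return False
    # (ii) Orientation: equate y = m1*x + b1 and y = m2*x + b2 pairwise;
    # the image is accepted only if every crossing has Y_cross <= 0.
    for i, (m1, b1) in enumerate(line_models):
        for m2, b2 in line_models[i + 1:]:
            if m1 == m2:
                continue                      # parallel rows never cross
            x_cross = (b2 - b1) / (m1 - m2)
            if m1 * x_cross + b1 > 0:         # Y_cross > 0: crossing inside
                return False
    return True
```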

In general, if any of the previously described anomalies is present, the entire image is rejected; otherwise it is accepted. As a final result, the automatic method gives the image of the maize crop with the four detected rows, curved or straight, mathematically modeled. As noted above, this number of rows is sufficient for automatic vehicle guidance and is also convenient for destroying weeds located in the inter-row spaces, according to the ideas of the RHEA (2014) project.

Regarding left/right concavity in curved crop rows, the method was designed for curved crop rows with concavity oriented towards the left, Fig. 13c. This suffices when the tractor is moving forward and the crop row concavity is oriented towards the left. However, when the tractor comes to the end of the field and starts to move back, the crop row concavity will be oriented towards the right. In this case, the method works as follows (a sketch is given after the list):

  (a)

    A vertical specular reflection is applied to a ROI with curved crop rows oriented towards the right (Fig. 15a), obtaining a new ROI with the crop row curvature oriented towards the left (Fig. 15b). Namely, the last column of the ROI becomes the first one, the penultimate column becomes the second, the antepenultimate the third, and so on.

    Fig. 15

    a Example of ROI with curved crop rows oriented toward right; b specular reflection with crop rows oriented toward left

  (b)

    Crop rows are detected on the curves oriented towards the left by using the proposed method as explained above, obtaining five points on each crop row (Fig. 12).

  (c)

    A horizontal translation operation is applied to each point on the curved crop rows by using Eq. 4, obtaining five new points for each crop row.

$$P(i, j)\xrightarrow{\text{translation}}\hat{P}(i,\, columns - j + 1)$$
    (4)

    where columns represents the number of columns of the ROI (columns = 2000).

  (d)

    A straight or quadratic curve is fitted by least squares (Eq. 2) using the new translated points (\(\hat{P}\)) for each crop row. These curves are oriented towards the right as expected.
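A compact sketch of steps (a)–(d), assuming a hypothetical left-concavity detector detect_left that returns five (i, j) points per crop row, is:

```python
import numpy as np

def detect_right_oriented(roi, detect_left, columns=2000):
    """Sketch of steps (a)-(d) for right-oriented curvature."""
    mirrored = roi[:, ::-1]                 # (a) vertical specular reflection
    rows_points = detect_left(mirrored)     # (b) detection on the mirrored ROI
    models = []
    for pts in rows_points:
        # (c) Eq. 4 with 1-based columns; for 0-based arrays use columns-1-j.
        pts = [(i, columns - j + 1) for i, j in pts]
        ys = np.array([i for i, _ in pts], float)
        xs = np.array([j for _, j in pts], float)
        models.append(np.polyfit(ys, xs, 2))  # (d) refit (quadratic shown)
    return models
```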

Results and discussion

The images used for testing belong to maize crops containing curved and straight crop rows, as explained above (Fig. 1). Different levels of weed infestation and gaps were considered in the sets of images. In the first case, the proposed method was tested with a low level of weed, i.e. up to 5% coverage according to the classification scale proposed by Maltsev (1962). The percentage of weed coverage was obtained by applying the quadrant method as follows:

$$\%\,\text{weed coverage} = \frac{A_{weed}}{A_{sampled}} \times 100$$
(5)

where \(A_{weed}\) is the area covered by weeds and \(A_{sampled}\) is the sampled area, i.e. 0.5 m × 0.5 m = 0.25 m².

A higher level of weed (>5%) could cause incorrect crop row detection or deviations from the expected furrows. As mentioned before, when this drawback is detected by the method, the image is rejected, Fig. 14b. Low, medium and high weed pressure mean, respectively, less than 5%, between 5 and 12%, and between 12 and 25% coverage in the ROI. Weed pressures above 25% lead to a high probability of failure. The 5% limit is derived from the rating scale proposed by Maltsev (1962).

In the second case, different levels of gaps were considered: low = 1 gap up to 0.40 m, medium = 2 gaps up to 0.80 m and high = 3 gaps up to 1.20 m. More than three gaps (>1.20 m) were not considered because the probability of failure is then greater than 50% with the camera system geometry and specifications used in the experiments, where the ROI represents 3.4 m × 5 m (width × length). Both gaps and high weed pressure significantly affect the performance of the proposed method on curved and straight crop rows, mainly the former.

Moreover, the minimum radius of curvature of the tested curved crop rows in the field was 19 m, a measurement obtained by topographic calculations. A smaller radius makes it difficult to navigate the tractor in the field and increases the risk of damaging crops.

Ground truth images

The performance of the proposed method was evaluated according to the similarity of its results to a ground truth manually obtained for each image. The ground truth was obtained as follows: based on visual observation of the images, an expert defined at least five points lying on each crop row of a test image, including the points where the four crop rows cross the base of the ROI. An application developed in Matlab, based on the curve fitting toolbox, was used to automatically fit a quadratic curve passing through these points.

A total of 920 images with plants at different growth stages (up to 45 days) were randomly selected for testing. The test images were divided into three sets: Set-1, containing only straight crop rows, with 120 images; Set-2, containing curved crop rows equally spaced (horizontally), with 300 images; and Set-3, containing curved crop rows non-equally spaced, with 500 images. Figure 16 shows examples of ground truth for each test set: Set-1 from (a) to (c), Set-2 from (d) to (f) and Set-3 from (g) to (j).

Fig. 16

Samples of image ROIs for each test image set together with CRDA and CRDA* values respectively; a–c Set-1 containing only straight crop rows; d–f Set-2 with curved crop rows equally spaced and g–j Set-3 with curved crop rows non-equally spaced. a CRDA = 0.908. b CRDA = 0.913. c CRDA = 0.934. d CRDA = 0.867. e CRDA = 0.886. f CRDA = 0.828. g CRDA* = 0.842. h CRDA* = 0.823. i CRDA* = 0.871. j CRDA* = 0.754

Comparison methods and performance measures

Hereinafter, the proposed method for detecting both curved and straight crop rows is denoted DAGP (Detection by Accumulation of Green Pixels). Its performance was studied by quantitative analysis, and the methods used for comparison were the following: (i) the standard Hough Transform (HT) proposed by Hough (1962), which has been broadly used for straight crop row detection; several constraints were applied to improve its performance, such as a number of lines to be detected equal to four, the inclination angle ranging between −45° and 45°, and the resolutions of the distance and the angle fixed to 1 pixel and 1°, respectively. (ii) Linear regression based on the Theil–Sen estimator (LTS), proposed by Guerrero et al. (2013) to adjust the crop rows. (iii) Linear regression based on least squares (LRQ), proposed by Montalvo et al. (2012), which uses templates to restrict the areas where crop rows are expected. (iv) Crop row detection (CRD), proposed by Romeo et al. (2012), based on image perspective projection, which looks for the maximum accumulation of segmented green pixels along straight alignments. (v) Template matching followed by Global Energy Minimization (TMGEM), proposed by Vidović et al. (2016), which detects regular patterns and uses a dynamic programming technique to determine an optimal crop model.

A measure referred to as crop row detection accuracy (CRDA) was defined to assess performance. It is computed by matching the horizontal co-ordinates (\(x_i\)) for each crop row obtained by each method under evaluation (HT, LTS, LRQ, CRD, TMGEM) to the corresponding ground truth values (\(\hat{x}_{i}\)) according to the matching score of Eq. 6.

$$matching\_score(\hat{x}, x, d) = \max\left(1 - \left(\frac{\hat{x} - x}{0.1\,d}\right)^{2}, 0\right)$$
(6)

where d is the inter-row space in pixels for each image row.

Then the average of the matching scores over all image rows is computed, which yields a value in the range [0, 1], obtained by Eq. 7. Values close to 1 indicate the best performances.

$$CRDA = \frac{1}{n\,r}\sum_{i=1}^{r} \sum_{j=1}^{n} matching\_score(\hat{x}_{i,j}, x_{i,j}, \hat{d}_{i})$$
(7)

where n is the number of crop rows to detect (i.e. n = 4) and r is the number of image rows (i.e. r = 650).

CRDA was used for the evaluation on straight crop rows (Set-1) and on curved crop rows equally spaced (Set-2). A slight variant of CRDA, proposed in this paper, was used for the evaluation on curved crop rows non-equally spaced (Set-3). This variant, referred to as CRDA*, uses the maximum inter-row space, max(d), instead of the per-row d value in Eq. 6.
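For illustration, Eqs. 6 and 7 and the CRDA* variant can be sketched as follows (a vectorized NumPy rendering; the array layouts are assumptions):

```python
import numpy as np

def crda(gt, est, d, star=False):
    """CRDA, Eq. 7. gt, est: (r, n) arrays of ground-truth and estimated x
    co-ordinates for r image rows and n crop rows; d: inter-row space in
    pixels per image row (length r). star=True computes CRDA* (max(d))."""
    d = np.asarray(d, dtype=float)
    if star:
        d = np.full_like(d, d.max())          # CRDA*: maximum inter-row space
    # Eq. 6: per-point matching score, clipped to [0, 1].
    score = np.maximum(1 - ((gt - est) / (0.1 * d[:, None])) ** 2, 0)
    return score.mean()

# Hypothetical example: r = 650 rows, n = 4 crop rows, 5 px error, d = 500 px.
r, n = 650, 4
gt = np.tile(np.linspace(250.0, 1750.0, n), (r, 1))
print(crda(gt, gt + 5.0, np.full(r, 500.0)))  # 1 - (5/50)^2 = 0.99
```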

Evaluation of the DAGP method

The performance of DAGP for detecting both straight and curved crop rows was evaluated through three tests (Tests 1–3). CRDA was used as the performance measure for Tests 1 and 2, while CRDA* was used for Test 3. In Test 1, DAGP was compared to HT, LTS, LRQ and CRD on image Set-1. In Test 2, DAGP was compared against TMGEM on image Set-2, and in Test 3 it was compared to TMGEM on image Set-3. HT, CRD and DAGP were implemented in Matlab, while LTS, LRQ and TMGEM were implemented in C++. Tests 1–3 are detailed below.

In Test 1, the average values, standard deviations and rankings of CRDA for image Set-1 are presented in Table 1. From these results, it can be seen that the average DAGP performance is higher than that of HT but slightly lower than those of the other methods (LTS, LRQ, CRD) on straight crop rows. In Test 2, the values of CRDA for image Set-2 are presented in Table 2. From these results, it can be seen that the average DAGP performance is also slightly lower than that of the TMGEM method on curved crop rows equally spaced. Regarding the standard deviations, all fall within the same order of magnitude. Thus, considering means and standard deviations together, DAGP achieves performances similar to LTS, LRQ, CRD and TMGEM but outperforms HT.

Table 1 CRDA for Test-1 on straight crop rows
Table 2 CRDA for Test-2 on curved crop rows equally spaced

In Test 3, the values of CRDA* for image Set-3 are presented in Table 3. From these results, it can be seen that the average DAGP performance exceeds that of the TMGEM method on curved crop rows non-equally spaced: DAGP has an average CRDA* of 0.856, while TMGEM has 0.548.

Table 3 CRDA* for Test-3 on curved crop rows non-equally spaced

From Tests 1–3, it can be seen that the results of DAGP vary only slightly among them. Therefore, it can be concluded that the proposed method is moderately insensitive to the curvature of crop rows and can detect straight and curved crop rows, equally and non-equally spaced, with high accuracy in maize fields by using a vision system conveniently arranged onboard a tractor, given prior knowledge of the curvature (toward either left or right) of the crop rows. Figure 16a–j shows illustrative examples representing 10 images from the 920 available (Sets 1–3). The computed crop rows are shown together with their CRDA and CRDA* values respectively, according to the extrinsic and intrinsic parameters of the vision system.

The computational cost was also measured. Table 4 shows the average processing times in percentage (%) and milliseconds (ms) of the overall method, distinguishing between the image processing modules, for the three sets of tested images. DAGP had average total execution times of 692, 712 and 721 ms for image Sets 1–3 respectively. The segmentation module accounted for 43.6% of the total time, the identification of starting points for 7.5% and the crop row detection for 49.0%.

Table 4 Average execution times in percentage (%) and milliseconds (ms) for each module of the DAGP method

Considering that, in general, agricultural vehicles work at speeds ranging between 3 and 6 km h−1, in the worst case (6 km h−1) the vehicle needs 3 s to travel the 5 m length of the ROI, which is much greater than the 721 ms (worst case) required for image processing. In addition, it must be considered that the running time was measured in Matlab GUIDE using an interpreted programming language. It could decrease significantly if the method were implemented in a compiled programming language (e.g. C++) running on a real-time platform and operating system, e.g. LabVIEW and cRIO as in the RHEA (2014) project. Under such an implementation, the processing time could be reduced by about 40%, as reported in RHEA, improving the execution time considerably; this would be useful for real-time applications and is a topic for future research.

Impact of crop row curvature

The performance of the method was also evaluated for detecting curved crop rows oriented toward the right (Fig. 15a); it was tested on 50 additional images containing straight and curved crop rows. The results obtained were similar to those presented in Tables 1, 2 and 3. The extra execution time required for the vertical specular reflection and the translation operations is less than 20 ms, i.e. the additional time proves insignificant and acceptable for the method.

Table 5 shows the ranges of the highest-degree coefficients for both types of row: the slope (m) for straight rows and the coefficient of curvature (a) for quadratics, which define the mathematical models of Eq. 2. The crop rows analyzed are therefore limited to the group of straight and curved rows bounded by these ranges. Thanks to this study, the limits of these parameters for each type of row have been obtained. This allows an important improvement during the verification process, when the rows must be accepted or rejected depending on whether the computed values of the coefficients (m, a) fall inside or outside these ranges.

Table 5 Range of allowed values for the coefficients of greater degree for each crop type on straight and curved crop rows oriented toward left and right

Finally, regarding the limitations of the proposed method, two constraints on its application should be considered: (i) the concavity of the crop rows (either left or right) must be known a priori and (ii) its applicability is limited by the crop row arrangement and by the extrinsic and intrinsic camera parameters.

Conclusions

The study proposes a new computer vision method to detect curved and straight crop rows in maize fields at initial growth stages of crops and weeds (up to 45 days). The approach consists of three linked phases: segmentation, identification of starting points and crop row detection.

The method has proven robust enough under uncontrolled lighting conditions, using a camera mounted on the front of a tractor under imaging perspective projection. It has been successfully tested with plants at different growth stages as well as with incomplete rows and different weed pressures irregularly distributed in the inter-row spaces (Fig. 1). In short, the proposed method works properly on straight crop rows (CRDA >0.91 in Table 1), curved crop rows equally spaced (CRDA >0.86 in Table 2) and curved crop rows non-equally spaced (CRDA* >0.85 in Table 3), with concavity oriented toward left or right (not simultaneously, Fig. 15a, b) and known a priori. It copes with the adverse and unexpected situations that can appear during crop row detection in uncontrolled agricultural environments, with acceptable results and processing times (≤721 ms in Table 4).

Additionally, the robustness of the proposed automatic method is complemented by the two previously indicated procedures: (i) selection and verification of rows, where several anomalies can be detected, such as the number of detected rows, the separation of the crop rows (Fig. 14a) and their intersection (Fig. 14b), caused by the significant presence of gaps and weeds; (ii) controlling the crop rows with the value ranges of the highest-degree coefficient for each line type (m, a), as described in Table 5.