
3.1 Introduction

The geometric characterization of crops has traditionally been based, and sometimes still is, on manual measurements that are prone to errors and have large associated labour and time costs. Consequently, the development of automated methods to obtain plant features in a quick, accurate and efficient way has attracted much interest. The simplest technique is to use digital cameras, based on charge-coupled device (CCD) or complementary metal-oxide-semiconductor (CMOS) sensors, to estimate properties such as plant height or diameter. However, this technique has limited ability to obtain geometric characteristics due to the overlap between vegetative organs. Three-dimensional modelling is needed to acquire an in-depth knowledge of crop geometry. A myriad of photogrammetric techniques has been developed to generate three-dimensional models from digital photographs. Section 3.2 of this chapter reviews the two that are most often applied in crop structural characterization: stereo vision and structure from motion.

As an alternative to photogrammetry, a set of active sensors can be used based on the emission of a signal and the detection of its return after interaction with the target under study (i.e. the crop). Unlike photogrammetric techniques, these measurements are not affected by lighting conditions, although they do not provide colour information. Depending on the signal type, it is possible to distinguish between ultrasound sensors, which emit high-frequency mechanical waves, and optical sensors, most notably those used in light detection and ranging (LiDAR) systems. These two sensor types are presented in Sects. 3.3 and 3.4, respectively.

A new family of sensors, known as depth cameras, has emerged in recent years. These cameras can take pictures with range (depth) data on a pixel basis, as described in Sect. 3.5. Although the resolution is lower than in photogrammetric or LiDAR-based techniques, these are low-cost sensors which, in many cases, also provide simultaneous colour information (red, green, blue-depth – RGB-D – sensors). Finally, Sect. 3.6 ends this chapter with a brief discussion on future trends in crop geometry sensing.

As an introduction to each specific technology, Table 3.1 offers a comparison of the main specifications (range, wavelength, resolution, price, and so on) of several commercial sensors used for the electronic characterization of crops. The technologies behind these sensors and some examples of successful research studies are discussed in this chapter.

Table 3.1 Specifications of different commercial sensors used for crop geometry and structure characterization

3.2 Photogrammetric Techniques

This section presents the principle of operation of stereo vision and structure from motion techniques, and reviews their application in crop geometry characterization. Other photogrammetric techniques, such as shape-from-silhouette (Shlyakhter et al. 2001), shape-from-focus (Billiot et al. 2013) and photometric stereo (Bernotas et al. 2019) are also used in agriculture, although to a lesser extent.

3.2.1 Stereo Vision

Stereo vision (SV) is a photogrammetric technique inspired by human vision which allows 3D information of a scene or object to be obtained from images taken simultaneously by two monocular cameras, although multifocal cameras can also be used (Rovira-Más et al. 2009). Depth information is obtained by identifying homologous (common) pixels in both images and measuring the variation in their position, a parameter known as disparity. The following steps need to be completed to obtain 3D point clouds: (1) camera calibration, (2) stereo rectification, (3) stereo matching and (4) point cloud reconstruction. Calibration consists of obtaining both intrinsic camera parameters (e.g. focal length, lens aberration) and extrinsic parameters (position and orientation of the cameras). Stereo rectification aims to align the epipolar lines of both images (i.e. the lines that connect the real point being measured with the centre of projection of each image). This simplifies the search for homologous pixels in the two images, since these are always located on the corresponding epipolar lines. The typical arrangement of an SV system is shown in Fig. 3.1, where P(x, y, z) is a real point in the scene, while xL and xR are its horizontal projections on the left and right images, respectively. The third step of the process, known as stereo matching, finds the homologous pixels and determines the disparity between the two views. Finally, a 3D point cloud is reconstructed from the disparity image by applying geometric triangulation. Following the example of Fig. 3.1, the depth R is computed using the following expression (Rovira-Más et al. 2011):

$$ R=\frac{b\cdotp f}{d}, $$
(3.1)

where b is the baseline or distance between the lenses of both cameras, f is the focal length (the same is assumed for both cameras) and d = xL − xR is the disparity.
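
As a minimal numerical illustration of Eq. 3.1, the Python sketch below converts a disparity map into depth values. The baseline, focal length and disparities are hypothetical values chosen for illustration; in a real SV system they would come from the calibration and stereo matching steps described above.

```python
import numpy as np

# Hypothetical stereo rig parameters (illustrative values only)
baseline_m = 0.12        # b: distance between the lenses of both cameras (m)
focal_length_px = 700.0  # f: focal length expressed in pixels

# Example disparity map d = xL - xR, in pixels (output of a stereo matcher)
disparity_px = np.array([[35.0, 28.0],
                         [14.0,  0.0]])

# Eq. 3.1: R = b * f / d; zero disparity means no match, so depth is undefined
with np.errstate(divide="ignore"):
    depth_m = np.where(disparity_px > 0,
                       baseline_m * focal_length_px / disparity_px,
                       np.nan)

print(depth_m)
# Larger disparities correspond to closer points:
# 35 px -> 2.4 m, 28 px -> 3.0 m, 14 px -> 6.0 m, 0 px -> undefined
```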

Fig. 3.1
figure 1

Binocular stereo vision system

An important advantage of SV is its ability to generate high-resolution 3D point clouds. During the processing of the point cloud, the availability of colour information allows more efficient segmentation. In addition, in an SV-based system both images are captured simultaneously using cameras attached to a fixed frame, thus minimizing the risk of failure associated with moving elements. However, SV systems require prior calibration, and the results are affected by the robustness of the stereo matching algorithms, as well as by the prevailing lighting conditions. The need to process large amounts of information has traditionally been a major difficulty associated with SV systems. In recent years, the development of increasingly fast processors, together with the introduction of low-cost stereo cameras (Li et al. 2017; Oliveira et al. 2018), has allowed SV to be used in real-time applications.

In the field of geometric characterization of crops, Ivanov et al. (1995) were pioneers in using SV to obtain 3D models of maize (Zea mays L.) plants. In their study, the stereo matching was done manually, labelling the leaves and identifying the homologous points in their contours. Estimates were made of the horizontal profile of the leaf area index (LAI), the vertical profile of the leaf area density (LAD), and leaf position and orientation. Many SV studies have been carried out under controlled laboratory conditions. For example, He et al. (2003) developed an SV system for transplant growth analysis. A sweet potato transplant population was measured, and a strong correlation was obtained between the destructively measured mass and the estimated volume, with a coefficient of determination of R2 = 0.88. For their part, Andersen et al. (2005) obtained 3D images of ten wheat plants using simulated annealing during the stereo matching process. These authors determined the size of the plant and its leaf area, with good agreement with actual values. SV was also applied by Biskup et al. (2007) in a laboratory study of the daytime and nocturnal movement of a soya bean (Glycine max (L.) Merr.) plant and to quantify its drought stress from the zenith leaf angle distribution. The system was also used to monitor a real soya bean plantation, verifying its robustness under field conditions. Müller-Linow et al. (2015) continued this work by developing more robust image processing routines to generate 3D reconstructions of the plants and to determine the LAI and the leaf angle distribution. Trials were conducted with sugar beet plants, and it was observed that leaf angles vary throughout the season and differ depending on the variety studied. The latest advances in matching algorithms have allowed Bao et al. (2019a) to develop a robotic platform for field phenotyping of sorghum (Sorghum bicolor (L.) Moench) plants. The system comprises six side-viewing camera pairs and can be used to measure multiple properties such as plant height, plant width, convex hull volume, plant surface area and stem diameter. Other SV applications include 3D tree reconstruction for automated blossom thinning (Nielsen et al. 2012), geometric characterization during the plant seedling stage (Xiong et al. 2017), and the development of systems to measure plant growth automatically, both under controlled indoor conditions (Yeh et al. 2014) and in greenhouses and outdoor fields (Lati et al. 2013).

In addition, SV has been widely used in autonomous navigation systems of agricultural vehicles (Reid and Searcy 1987; Rovira-Más et al. 2004; Kise et al. 2005). In this context, it is worth noting the SV system developed by Kise and Zhang (2008a), which had the double goal of locating crop rows for tractor guidance and 3D crop mapping. This system was validated in a soya bean field with three different plots according to the days elapsed since planting. The height differences were clearly detected. In another project, Kise and Zhang (2008b) developed an SV sensor comprising two multispectral cameras placed on a terrestrial vehicle. The spectral information was combined with 3D data to generate multispectral 3D field images of the soya bean field. In this vein, Zhao et al. (2018) recently presented a prototype that combines SV and a concave grating spectrometer (450–790 nm). This allows simultaneous monitoring of the biochemical properties and morphological features of crops.

3.2.2 Structure from Motion and Multi-View Stereo

The combination of structure from motion (SfM) and multi-view stereo (MVS) techniques enables 3D point clouds to be generated from a set of images taken from different viewing angles (Ullman 1979; Seitz et al. 2006). Unlike SV, a single RGB camera is moved to different positions, taking a large number of pictures with a high degree of overlap between them. In the SfM process, invariant feature points have to be identified in several images. For this purpose, numerous feature detector algorithms have been developed (Tareen and Saleem 2018), most notably the scale-invariant feature transform (SIFT; Lowe 1999) and speeded-up robust features (SURF; Bay et al. 2006). A sparse 3D point cloud is then generated, the intrinsic and extrinsic parameters of the camera are determined, and the camera position and orientation are estimated. Camera projective matrices computed with SfM are used by the MVS approach to provide a dense 3D point cloud, increasing the number of points by two or three orders of magnitude (James and Robson 2012).
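
The feature-matching step at the core of SfM can be sketched in a few lines of Python using OpenCV, which provides a SIFT implementation. The sketch below only finds and filters correspondences between two overlapping views (file names are hypothetical); a complete SfM–MVS pipeline additionally estimates the camera parameters, triangulates a sparse cloud and densifies it, tasks normally left to dedicated photogrammetric software.

```python
import cv2

# Two overlapping photographs of the same scene (hypothetical file names)
img1 = cv2.imread("view_01.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view_02.jpg", cv2.IMREAD_GRAYSCALE)

# Detect invariant feature points and compute their SIFT descriptors
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Match descriptors between the two views and keep only distinctive matches
# (Lowe's ratio test); these correspondences feed the SfM reconstruction
matcher = cv2.BFMatcher()
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]

print(f"{len(good)} putative correspondences between the two views")
```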

In the last ten years, the application of SfM–MVS has revolutionized 3D plant modelling, particularly when a high level of detail is required (Quan et al. 2006; Pound et al. 2014). The development of these techniques has been made possible by the increase in computing capacity and by commercial photogrammetric software that automatically generates 3D reconstructions (Probst et al. 2018). SfM–MVS is a low-cost image-based technique and, as can be seen in Fig. 3.2a, provides very high resolution, accurate and realistic 3D representations. In addition, unlike SV, prior camera calibration is not required, nor is it necessary to know the camera position or orientation. However, the need to acquire multiple images, as well as to process them, is an important limitation in real-time applications.

Fig. 3.2
figure 2

The 3D point clouds of an apple (Malus domestica Borkh.) tree generated using: (a) structure from motion and multi-view stereo, (b) a LiDAR-based system (colour denotes intensity of returns) and (c) a depth camera

SfM–MVS has been widely applied to plant phenotyping in indoor conditions. Rose et al. (2015) used this technique to monitor the growth of five tomato plants for six days. Between 40 and 70 photographs per plant were taken to generate the 3D point cloud. From this, the leaf area and the main stem height were determined, obtaining a strong correlation (R2 > 0.96) with reference measurements made with a laser sensor. Similarly, Santos and Rodrigues (2016) used a low-resolution camera (1.3 MP) and MVS for 3D reconstruction of sunflower (Helianthus annuus L.) and maize plants, both in the initial and in the final stage of development (2 m high). For the maize plant, leaf lengths were determined with errors below 9 %. In an experiment with cucumber (Cucumis sativus L.), pepper (Capsicum annuum L.) and eggplant (Solanum melongena L.), Hui et al. (2018) concluded that a minimum of 60–80 images was required to reconstruct 95 % of the leaves. Duan et al. (2016) also demonstrated the possibility of using SfM–MVS to study plant growth (in this case, wheat) in greenhouse conditions.

Recently, SfM–MVS has been applied to crop phenotyping under field conditions. Jay et al. (2015) generated 3D models of five plant species: sunflower (Helianthus annuus L.), savoy cabbage (Brassica oleracea var. sabauda L.), cauliflower (Brassica oleracea var. botrytis L.), Brussels sprouts (Brassica oleracea var. gemmifera L.) and sugar beet (Beta vulgaris L.). The RGB images were acquired by displacing a camera along the row. The plants were discriminated from the environment on the basis of colour and plant height. Other authors have used SfM–MVS to estimate crop growth in the field. In this context, Zhang Y et al. (2018b) monitored the evolution of sweet potato plants over four months and under six different fertilization conditions. Good estimates of plant height, leaf number and leaf area were obtained (R2 > 0.97). For their part, Mortensen et al. (2018) mounted a stereo camera on an agricultural robot and monitored several Cos (Lactuca sativa L. var. Cos) and Iceberg lettuces (Lactuca sativa L. var. Iceberg) over five weeks. Correlations between the estimated surface areas and the actual fresh weight were strong, with coefficients of determination of R2 = 0.84–0.94. Likewise, Andújar et al. (2018) detected and classified different weed species in a maize field. This work demonstrated the use of the SfM–MVS technique to obtain dense point clouds from which reliable plant models can be generated. Their results agreed with those obtained by Martinez-Guanter et al. (2019), who compared 3D crop models (maize, sugar beet and sunflower) generated using SfM and RGB-D systems.

Fruit detection, including wine grapes (Herrero-Huerta et al. 2015) and apples (Gené-Mola et al. 2020b), is another area where SfM–MVS has strong potential. In this context, Rose et al. (2016) developed a platform for field vineyard phenotyping with RGB cameras and an on-board real-time kinematic global navigation satellite system (RTK-GNSS) receiver. The MVS was applied to the georeferenced images, obtaining 3D point clouds of the vineyard from which it was possible to detect the number and size of berries.

As has been shown, SfM–MVS has the potential to create 3D plant reconstructions at the organ level. However, its large-scale field application currently faces several challenges, including the development of automatic image acquisition and processing systems and the need to resolve wind-related limitations (plant position can vary from one image to another).

3.3 Ultrasonic Sensors

Ultrasonic sensors use the transmission and reception of an ultrasound wave signal (a graphical description of a wave emission can be seen in Fig. 3.3) to detect objects and some of their properties at certain distances. The distance to a target is found by measuring the time interval between wave emission and reception, a principle known as time-of-flight measurement. The properties of the target can be interpreted by analysing the echo wave that returns from it (echo wave in Fig. 3.3). Using a piezoelectric or dielectric membrane device, the sensor generates ultrasonic sound wave packages that bounce back from the target and, after a very short time interval, are received and read by the same device or by a dedicated receiving membrane.

Fig. 3.3
figure 3

Typical acoustic waveform: both the emitted and the returned waves are shown

The simplest analysis technique is the ‘pulse-echo’ method, where a short burst of ultrasound is emitted and the time taken by the signal to return is measured. The range resolution, the smallest distance that can be measured, is determined by the resolution of the time measurement. The time is found by considering the phase of the returned signal, and the higher the frequency, the better the resolution. However, it is not possible to increase the frequency indefinitely to improve distance resolution because ultrasound is attenuated as it travels through a medium and this attenuation increases with frequency (Wykes et al. 1994).
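
As a simple illustration of the pulse-echo principle, the sketch below converts a round-trip echo time into a distance, using the standard approximation for the speed of sound in air as a function of temperature; the echo times and temperatures are hypothetical values.

```python
def speed_of_sound(temp_c: float) -> float:
    """Approximate speed of sound in air (m/s) at a given temperature (deg C)."""
    return 331.3 + 0.606 * temp_c

def pulse_echo_distance(echo_time_s: float, temp_c: float = 20.0) -> float:
    """Distance to the target from the round-trip time of an ultrasonic pulse.

    The pulse travels to the target and back, hence the division by 2.
    """
    return speed_of_sound(temp_c) * echo_time_s / 2.0

# Example: an echo received 8 ms after emission at 20 deg C
print(f"{pulse_echo_distance(8e-3, temp_c=20.0):.2f} m")  # ~1.37 m

# The same echo time at 35 deg C gives a different distance, which is why
# air temperature is one of the factors studied in the works cited below
print(f"{pulse_echo_distance(8e-3, temp_c=35.0):.2f} m")  # ~1.41 m
```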

Sprayer control systems have been developed, using different techniques, which sense the presence or absence of target plants and control the sprayer output in an on–off or selective manner (Reichard and Ladd 1981; Ladd et al. 1981). At the time of their development, these systems detected only the presence of targets using photoelectric systems but not actual target characteristics. In 1987, a commercially available orchard air-blast sprayer control system (Roper Grower’s Cooperative, Winter Garden, FL) represented a refinement of the selective control approach. Five ultrasonic transducers on either side of the sprayer could detect the presence or absence of tree foliage, and spray nozzles were then activated accordingly (Giles et al. 1987).

With respect to the performance and accuracy of this technology, different scientific studies have been carried out to verify the feasibility of using this kind of sensor for crop measurement. Zaman et al. (2007) investigated the errors in citrus canopy volume measured with a 10-transducer ultrasonic orchard measurement array. The factors studied were ground speed accuracy, uncalibrated air temperature and the effect of ultrasonic transducer crosstalk. Deviations in the driving path from the centreline between two rows and the height error in the transducer array due to improper tyre inflation and uneven ground were also estimated. A few years later, Jeon et al. (2011) presented further research in which ultrasonic sensors were subjected to simulated environmental and operating conditions to determine their durability and accuracy. Conditions tested included exposure to extended cold, outdoor temperatures, cross-winds, temperature change, dust clouds, travel speeds and spray effect. They verified that these conditions have to be taken into account when determining canopy distance in real and simulated plant canopies.

Similar studies were carried out on apple and olive (Olea europaea L.) trees by Escolà et al. (2011) and Gamarra-Diezma et al. (2015), respectively. In those cases, the main objective was to obtain a good distance measurement to estimate crop volume at different tree heights. In the apple tree study, the sensors were required to be at a minimum distance of 60 cm apart because of errors from interference at <10 cm. In this case, the narrow distance between crop rows (<4 m) helped to allow operation with a short inter-sensor distance because the interference caused by the reflection of the neighbouring sensor's wave was less important. In olive tree plantations, where row distances are wide (>10 m), the distance between sensors was increased to 130 cm. Accurate volume measurement is an important aspect of tree crop characterization, and it can be obtained with ultrasonic sensors provided that the distance between sensors is taken into account.
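
A minimal sketch of how side-mounted ultrasonic readings can be turned into canopy width and volume estimates is given below. The row spacing, sensor spacing and distance readings are hypothetical, and the slice-by-slice summation is a simplification of the procedures used in the studies above.

```python
# Hypothetical example: a vertical array of ultrasonic sensors on each side of
# the row measures the free distance to the canopy at several heights. Canopy
# width at each height is the row spacing minus the two free distances, and the
# canopy volume of a row section is approximated by summing the slices.

row_spacing_m = 4.0      # distance between the two sensor masts (m)
slice_height_m = 0.5     # vertical spacing between sensors (m)
section_length_m = 1.0   # length of the scanned row section (m)

left_readings = [1.6, 1.4, 1.3, 1.5]    # sensor-to-canopy distances, left side (m)
right_readings = [1.7, 1.5, 1.4, 1.6]   # sensor-to-canopy distances, right side (m)

widths = [max(row_spacing_m - (l + r), 0.0)
          for l, r in zip(left_readings, right_readings)]

volume_m3 = sum(w * slice_height_m * section_length_m for w in widths)
print(widths)                                   # canopy width at each height (m)
print(f"{volume_m3:.2f} m3 per {section_length_m:.0f} m of row")
```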

Ultrasound can also be used in field and bush crop measurements. For example, an ultrasonic sensor is a simple way to measure distance and can be mounted on agricultural machinery to improve harvesting procedures. Zhao et al. (2010) used two ultrasonic sensors mounted on either side of a combine harvester header to detect the presence of the crop and to measure the width of crop in the header. With that information, they could determine when the system needed to start and stop collecting data, as well as the real cutting width, which is an important parameter for a yield monitor.

With its ability to measure plant height, the ultrasonic sensor can also be used as a weed detector, especially when differences between crop or soil and weeds are evident. Andújar et al. (2011, 2012) carried out several studies on weed detection using ultrasound sensors and applied the technique to two important field crops, maize and winter wheat (Triticum aestivum L.). In maize, where the aim was to differentiate grasses and broad-leaved weeds, the sensor's static readings permitted discrimination of 81 % of pure grass stands and up to 99 % of pure stands of broad-leaved weeds. Dynamic measurements confirmed its potential to detect weed infestations. In the winter wheat application, the authors considered the hypothesis that weed-infested zones have more biomass than non-infested areas. Among the main results, they found that ultrasound readings could discriminate areas with weed infestation with a success rate of about 92.8 %. After detecting a weed infestation, a method to reduce it can be applied. In this respect, Zaman et al. (2011) proposed a weed spraying system using eight individual nozzles, each with an ultrasonic sensor to detect the weeds and activate the spraying. The selective application was adjusted depending on weed height above the wild blueberry (Vaccinium angustifolium) crop. In this case, weed height was between 35 cm and 55 cm higher than that of the crop, which ranged from 12 cm to 30 cm.

For the non-destructive assessment of forage mass in legume–grass mixtures, Fricke et al. (2011) proposed the use of an ultrasonic sensor yield mapping system in two field experiments. Important conclusions were drawn from stationary and on-the-go experiments on pure swards and mixtures of legumes and perennial ryegrass (Lolium perenne L.). In this research, the ultrasonic sensor readings were georeferenced using a high-accuracy differential global navigation satellite system (DGNSS) receiver. Forage mass, calculated from the ultrasonic height measurements, could be predicted with an acceptable range of accuracies (R2 from 0.6 to 0.86). The relationship between ultrasonic sward height and forage mass was affected by various factors: sward type, weed abundance, abrupt forage mass changes and growth periods. At the end of their project, they proposed further lines of research to improve the technique, including combining information from image sensors.

In this particular respect, sensor fusion could be an interesting option when ultrasonic sensors on their own cannot extract all the information required for a specific purpose from the environment. With this in mind, Farooque et al. (2013) integrated a system comprising an ultrasonic sensor, a digital colour camera and a slope sensor. All the data from the sensors were georeferenced with an RTK-GNSS receiver. This integrated system was applied in a commercially managed wild blueberry field, with the sensors mounted on a specially adapted harvester. In this research, as in previous studies, the ultrasonic sensor was used to measure plant height, which was then correlated with fruit yield determined by counting blue pixels captured by the colour camera. All the data were processed and represented on a map in which five differentiated zones were defined. In a study by Fricke and Wachendorf (2013), ultrasonic height measurements were also taken, but in this case the information was combined with spectral-radiometric reflection measurements. Various hyperspectral vegetation indices (VIs) were evaluated to determine whether biomass predictions made using only the ultrasonic sward height could be improved by complementing the measurements with these indices. Spectral reflections measured on the canopy appeared to explain differences detected in biomass measurements. In a more recent paper, Yuan et al. (2018) compared an ultrasonic sensor, a LiDAR sensor and data from unmanned aerial vehicles (UAVs) for measuring wheat plant height, with the results compared against manual measurements. The ultrasonic and laser sensors were mounted on a ground-based multi-sensor phenotyping system. The ultrasonic sensor performed worse than the other systems.

Upchurch et al. (1993) designed an intelligent tree trunk diameter calliper. The system was calibrated to work with objects (tree trunks in this case) with diameters from 1.6 cm to 19 cm with a small mean error of 0.04 cm. Tree trunk diameter is an interesting property to measure in orchard production, and in this case the measurement technique was improved and automated by means of ultrasonic sensors.

The use of ultrasonic sensors in orchard crops has facilitated the gathering of important data to improve specific field tasks. In the 1980s, two interesting studies by McConnell et al. (1983) and Giles and Delwiche (1988) were published with similar titles: “Electronic measurement of tree-row volume” and “Electronic measurement of tree canopy volume”, respectively. Their objective was to measure the distance to the foliage of the crop from the position of the sensor; this can be done from the centre of two rows in row measurement or from the tree side in measurements of individual trees. In McConnell et al. (1983), a mast with a range of transducers was mounted to read the shape of the crop at different heights. In Giles and Delwiche (1988), three ultrasonic sensors were mounted on each side of the sprayer to characterize the crop as the target of a spray application. Subsequently, this process was applied in citrus crops by other authors. Tumbo et al. (2002), for example, mounted a total of 20 ultrasound transducers, 10 on either side of a mast mounted behind a tractor, to read the crop shape. With this number of sensors, cross-talk interference was evident, so the system was designed to trigger the sensors in sequential groups of three to prevent this effect. When the effects of canopy foliage density and the ground speed of the system were tested in citrus trees (Zaman and Salyani 2004), it was found that canopy foliage density had a significant effect on ultrasonic volume estimation, with greater differences obtained in low-density canopies. However, there were no differences in volume measurements at ground speeds of 1.6 km h−1 to 4.7 km h−1. Later, the information provided by the ultrasonic sensors was mapped using DGPS (Schumann and Zaman 2005). For that study, tree height and volume were interpreted from the data of 10 ultrasound sensors mounted on a vertical mast at 0.6 m increments, all reading the same side of the row. That required two passes along each alleyway to cover the whole field. The mast was mounted on a trailer pulled by a car. Measurement accuracy was reduced given the short distance between sensors, but it was sufficient to provide maps of application rates to adjust sprayers.

A year later, an electronic sprayer control system was proposed by Solanelles et al. (2006) to adjust a variable-rate spray to different crops: olive, pear (Pyrus communis L.) and apple. In this case, information from the ultrasonic sensor was used to estimate tree canopy width which was then compared to the maximum tree width of the whole orchard. On the basis of the ratio obtained, different solenoid valves were adjusted to supply the appropriate amount of liquid spray to the canopy.

Other studies have since been performed to adapt the principle of canopy measurement for variable-rate spray application to grapevines (Gil et al. 2007, 2013; Llorens et al. 2010). The objective of the ultrasonic measurements was to calculate vine volume; the sprayer output was then adjusted using proportional solenoid valves according to an application coefficient, defined as litres of liquid per cubic metre of crop.
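
The conversion from measured canopy volume to sprayer output can be sketched as follows; the application coefficient, forward speed and canopy cross-section are hypothetical values, not those used in the studies cited.

```python
# Hypothetical variable-rate calculation based on an application coefficient
# (litres of spray liquid per cubic metre of canopy). All values are illustrative.

application_coeff_l_m3 = 0.1     # L of liquid per m3 of canopy
forward_speed_m_s = 1.5          # sprayer ground speed (m/s)
canopy_cross_section_m2 = 0.9    # canopy cross-sectional area from the sensors (m2)

# Canopy volume treated per unit time (m3/s) = cross-sectional area * speed
canopy_volume_rate = canopy_cross_section_m2 * forward_speed_m_s

# Required liquid flow rate, converted from L/s to L/min, to drive the valves
flow_l_min = application_coeff_l_m3 * canopy_volume_rate * 60.0
print(f"{flow_l_min:.2f} L/min")   # ~8.10 L/min for this canopy section
```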

As can be seen in Fig. 3.3, it is also possible to extract information from the analysis of the energy of the echo wave after it bounces off the target. This type of analysis allows another interesting property, canopy density, to be predicted. This variable can be used to improve volume estimates with information about how the crop is organized internally within the estimated volume. Stajnko et al. (2012) proposed an analysis of the echo envelope in 15-cm ultrasound bands to measure canopy size and density, and then adjust spray application accordingly. Pallejà and Landers (2015) analyzed the returned ultrasound wave in greater depth. For their research, they used four ultrasound sensors on each side of the crop (grapevine and apple), which were activated synchronously to prevent interference. With this system, they were able to provide an estimate of canopy density specific to each side of the crop, with an error of 14 % in vineyards and 3.8 % in apple trees.

Since 2015, little research using ultrasonic sensors has been published, mainly because this kind of sensor has been displaced by advances in other technologies, notably LiDAR sensors, which provide better and faster data for the electronic characterization of crops. Some of these sensors are described in the following section of this chapter.

3.4 Optical Sensors

The main optical sensors used in canopy characterization are described below. First, photoelectric sensors are introduced as a simple technology to detect the presence of and the distance to a target. Second, the LiDAR principle is presented as a well-known but complex technology that is used to extract interesting and useful information from the surroundings; in this case, surroundings composed of different crop structures.

3.4.1 Photoelectric Sensors

Photoelectric sensors are active sensors able to detect objects from the interruption or reflection of light, often infrared, within a defined sensing range. Depending on how the sensor is used and the type that is employed, they permit the presence or absence of one or more objects to be determined. In the case of reflection-type sensors (Fig. 3.4), objects located beyond the sensing cut-off point are disregarded. These sensors are usually low-cost, compact and rugged.

Fig. 3.4
figure 4

Types of photoelectric sensors. (a) through-beam system, (b) retroreflective system and (c) diffused system 

These photoelectric sensors are commonly used in industrial environments to detect the position of physical elements through the interruption of a fixed optical light beam. They are usually applied for near-range detection and have a specific range that depends on the type used (Fig. 3.4); wide detection ranges are not possible. There are three main types of these easy-to-mount sensors. In a through-beam system (Fig. 3.4a), separate emitter and receiver devices are installed on either side of the working area. In a retroreflective system (Fig. 3.4b), the light source and receiver are in the same housing and the sensor is aimed at a reflector which sends the signal back to the receiving unit when there is no object in between. Finally, in a diffused system (Fig. 3.4c) the light source and receiver are housed in the same device and the light beam is reflected back by the target itself.

Although these sensors are used far less in agriculture than in industrial environments, some studies have used these sensor types, or adapted versions of them, to detect vegetative elements. Some examples of these studies are described below.

Hooper et al. (1976) designed a photoelectric sensor to distinguish between plant material and soil. They analyzed the differences in plant vs. soil reflectivity to detect the plant structure. The operation of the system depended on measuring the ratio of visible to near-infrared reflected radiation, which is less for leaves than for soil due to the absorption of visible red radiation by chlorophyll. For this specific sensor, the working principle is somewhat different from the principle described in Fig. 3.4. In this case, the sensor needs to be able to read the reflected radiation. To do this, the receiver measures the intensities of near-infrared and visible radiation reflected from a surface illuminated by the emitter.
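
A minimal sketch of this decision rule is shown below: the red/NIR ratio is low for leaves (chlorophyll absorbs red) and higher for soil. The threshold and the reflectance values are hypothetical and would need calibration for a real sensor.

```python
def is_plant(visible_red: float, near_infrared: float,
             ratio_threshold: float = 0.5) -> bool:
    """Classify a reading as plant (True) or soil (False) from reflected radiation.

    Leaves absorb visible red light but reflect strongly in the near infrared,
    so the red/NIR ratio is lower for plants than for soil.
    """
    return (visible_red / near_infrared) < ratio_threshold

print(is_plant(visible_red=0.08, near_infrared=0.45))  # True  -> leaf-like reading
print(is_plant(visible_red=0.30, near_infrared=0.35))  # False -> soil-like reading
```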

The photoelectric sensor has also commonly been used to adjust machinery output. A selective applicator for post-emergence herbicides in row crops was developed and tested by Shearer and Jones (1991). In this study, the presence of weeds was sensed using a modulated near-infrared (NIR) light source and a phototransistor receiver. A solenoid valve was activated, or deactivated as required, through a logic circuit to release herbicide through the spray nozzle. The results showed herbicide savings of 15 % with no adverse effect on weed control.

Ladd et al. (1981) proposed a selective sprayer using pulse-modulated solenoid valves. In this case, infrared (IR) sensors were used to detect plants and trigger pesticide spraying. The crops detected with this sensor were cabbage, cauliflower and pepper. Field testing of the system confirmed that pest control was maintained while pesticide use was reduced by 24–51 %. The field crop system developed required the target to pass between the emitter–detector sensor pair.

Miller et al. (2003) tested a variable-rate granular fertilizer spreader with both GPS-guided prescription mapping and real-time tree presence measurement with photoelectric sensors in citrus groves. The variable-rate fertilizer unit performed well for site-specific fertilizer application with large-scale variability within a grove. They used six units of Banner QMT42 (Banner Engineering, Minneapolis, Minn.) long-range diffuse photocells to measure the citrus trees and to adjust the prescription rate output from the prescription map.

More recent research includes the robotic system presented by Hayashi et al. (2010). The authors used a photoelectric sensor to detect the presence of a piece of fruit manipulated by the arm of the robotic system. In this system, the photoelectric sensor works on a gripper that is responsible for grasping and cutting the peduncle of a strawberry while a suction device holds the fruit.

3.4.2 LiDAR Sensors

Advancing beyond simple directional photoelectric sensors, LiDAR technology can aim the light beam in different directions. LiDAR sensors use light, usually an infrared laser, to measure the distance to the sensor's surroundings.

A general classification of commercial LiDAR solutions can be made according to the dimensions (axis directions) scanned. Following this criterion, it is possible to distinguish 1D, 2D and 3D LiDAR sensors. The first type can only provide information in one direction, while 2D LiDAR sensors can provide information in two directions. The 3D LiDAR sensors can provide information from the surroundings in three directions while the sensor remains stationary in one position. To generate a whole 3D point cloud, the LiDAR system needs either to use two internal rotary mirrors to direct the laser beam to different angles, or to have the ability to rotate its head 360° to scan the surroundings. If the LiDAR sensor has neither of these capabilities, it needs to be moved to different positions, for which a positioning system is required. According to the operating principle, there are solid-state sensors without mechanically moving parts and sensors with moving parts; usually, those moving parts are rotating mirrors that redirect a single laser beam in different directions.

The first step in the working process of the sensor is to generate a beam of infrared light that is redirected in different directions using a mirror. Multiple sensor formats are possible, from a limited angular range of detection (short field of view) to a 360° angular range of detection. In all cases, the raw data received from the sensors are stored in polar format (angle and distance for each distance measurement) with the origin in the centre of the sensor. Then, using a basic trigonometric transformation, each measured point can be located in a Cartesian coordinate system. To generate a whole point cloud of a target, the LiDAR system needs to be moved to different positions, for which a precise positioning system is required. Simultaneous localization and mapping (SLAM), inertial measurement units (IMUs) and GNSS receivers are used to register the different scans and to geo-reference the position of the sensor. SLAM algorithms enable the LiDAR sensor position to be estimated in relation to the points captured in the current and previous scans. In this process, the movement is estimated sequentially by comparing point clouds. This allows a precise 3D point cloud map to be created using identified features captured and located in the scanning process. The IMU is an electronic device that records inertial information (orientation, angular rate, force and acceleration) about the LiDAR sensor's movement during the scanning process. With that information, it is possible to correct or compensate for the LiDAR positioning. SLAM+IMU systems are combined with GNSS receivers to improve the accuracy of the point cloud captured with LiDAR sensors.
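
The polar-to-Cartesian step can be illustrated with the short sketch below. The beam angles, ranges and along-row position are hypothetical, and the scan plane is assumed vertical and perpendicular to the direction of travel; a real MTLS pipeline would instead use the SLAM, IMU and GNSS data described above to register each scan.

```python
import numpy as np

# One 2D LiDAR scan stored in polar format: beam angle (deg) and range (m)
angles_deg = np.array([-30.0, 0.0, 30.0, 60.0])
ranges_m = np.array([2.8, 2.1, 2.4, 3.5])

# Basic trigonometric transformation from polar to Cartesian coordinates, with
# the origin at the centre of the sensor (y = across-row distance, z = height)
theta = np.radians(angles_deg)
y = ranges_m * np.cos(theta)
z = ranges_m * np.sin(theta)

# Geo-reference the scan by adding the platform position along the row (x),
# here a hypothetical value that would normally come from GNSS or odometry
x_along_row_m = 12.40
points = np.column_stack([np.full_like(y, x_along_row_m), y, z])
print(points)   # one (x, y, z) row per measured point
```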

This section will focus on terrestrial laser scanners (TLS), laser-based systems that can be used statically (with 3D capture capability) or in motion after being mounted on a mobile or portable system (with either 2D or 3D capture capabilities). This chapter does not cover aerial LiDAR, which is commonly used in electronic forest characterization where large areas are scanned and processed. In our case, for close-range terrestrial crop characterization, the technique is called mobile terrestrial LiDAR or laser scanning (MTLS), with the LiDAR sensors deployed on a terrestrial vehicle or platform to scan a crop. This system can be designed using all manner of terrestrial vehicles or indoor equipment, including all-terrain vehicles, cars, bicycles, movable platforms, specific engine-equipped prototypes, manual trolleys that are pushed alongside the crop or even backpacks carried by a person on foot. An example of an MTLS using a robotic platform is shown in Fig. 3.5a.

Fig. 3.5
figure 5

Procedure and point cloud results from terrestrial LiDAR scanning. (a) mobile terrestrial robotic platform with sensor and positioning system, (b) close-range point cloud of a scanned apple tree row section, (c) five-row almond orchard and (d) vineyard point cloud. All LiDAR point clouds are represented using a colour scale representing height

A brief description is provided below of the most important research studies that have been published to date on the use of LiDAR to extract information about agricultural crops.

One of the first studies to use a LiDAR system was presented by Vanderbilt et al. (1979). They applied what was then a new laser measurement technique for the first time to measure the irradiance distribution in a vegetative canopy of wheat. The new technique was conceptually similar to the traditional manual point quadrat methodology, but exploited the capabilities of an electronic system. The laser light source was mounted on a tripod and moved at an angle (pan), measuring the height where each light beam first hit the foliage. According to the authors, in comparison with other available methods the laser technique was amenable to automation and applicable to canopies of various heights and at different measurement distances. This was in effect the beginning of laser-based crop measurements. A decade later, Walklate (1989) used a similar optical instrument for measuring crop geometry based on the probability of a light beam intersecting plant surfaces within a crop. In this case, the laser emitter was based on a helium–neon laser and the return from the target was detected with a backscattered light collection system. The system was applied to a barley (Hordeum vulgare L.) crop and the results compared with crop area density measurements obtained with destructive manual procedures. In that study, Walklate proposed that the LiDAR data processing methodology could be understood using the Poisson law, where the probability of a LiDAR laser beam intersecting a plant surface was taken as similar to that of a beam of sunlight penetrating a canopy.

As with other sensor types, when LiDAR sensor measurements are proposed for use in crop characterization, accuracy and performance are factors of fundamental importance that need to be taken into account. With this in mind, several authors have undertaken specific studies on the behaviour of these measurements under the effect of different interactions. When used to measure distances, LiDAR systems are faced with the problem that, due to its size, the laser beam may intercept more than one object, with partial interception on multiple targets. In such cases, the distance measurement could be affected by the way this interception is produced. This effect, known as mixed pixels or edge effects, was assessed by Sanz-Cortiella et al. (2011). In their study, the effect on distance measurement was analyzed when the laser beams of an LMS200 LiDAR were partially blocked by different surface shapes. Mixed pixels usually appear when a canopy is scanned with LiDAR sensors. Their main conclusion was that, in a partial beam blockage scenario, the distance measured with the sensor depends more on the blocked energy than on the blocked surface area. Their research showed that it is important to know the dimensions of the laser beam cross-section in sensor simulations, and these also need to be known when selecting the LiDAR sensor.

Another clear example of the effect on LiDAR sensor performance relates to factors that can affect volume measurement in tree crops when no GNSS system is used, as described in Pallejà et al. (2010). According to the authors, the distance to the crop and the angular movement of the sensor are crucial to obtain accurate volume estimates. They concluded that all procedures for tree volume estimation should incorporate additional devices such as inertial measurement units (IMUs) or methods to control or estimate and correct the trajectory.

The LiDAR position and point of view are also very important, especially when the shape of the crop creates shadows that make it impossible to scan some parts of it. This effect can be considerable and difficult to solve in very dense crops. The effect of the position of the sensor in relation to the crop was studied by Bietresato et al. (2016), who used a scanning system with two LiDAR sensors working simultaneously at different heights as a LiDAR stereovision system. Another important effect in crop characterization arises when the scanning is incomplete (only one side of the crop row). When tree crops are characterized electronically, the operator can decide whether or not the scanning system collects information from each side of the crop. In fact, according to Auat Cheein et al. (2015), scanning only one side can represent a reduction in accuracy of up to 30 % in estimates of tree crown volume. Today, an automated crop characterization service does not need to cover the whole field; a discontinuous systematic sampling procedure can instead be applied to extract a raster map of the field. For example, del-Moral-Martínez et al. (2016) proposed two LAI characterization strategies in vineyards: continuous on-the-go scanning with computation of 1 m LAI values along the rows, or discontinuous systematic scanning of specific sampling sections separated by up to 15 m along the rows. The resulting raster map of the field was unaffected by the method employed.

As in the section on ultrasonic sensors, the following review of research studies in which LiDAR sensors were used to extract information focuses first on field crops and then on orchard crops.

LiDAR sensors have been used to measure and numerically describe the architecture of field crops. In this case, the sensor needs to be moved or placed in a high position above the ground to be able to read the top and interior of a crop that normally extends over the soil surface. One early study is described in Vanderbilt et al. (1990), in which a LiDAR sensor was used to extract crop density in maize. In this case, the method used provided an interception coefficient for classes of vegetation in layers of the canopy viewed in various directions. Sometimes, the main crop is not the target of the laser readings, as for example with weeds that can grow in inter-row areas of maize fields (Andújar et al. 2013). In that study, the authors proposed and tested a procedure for weed characterization. The weeds were detected by measuring their height, which correlated strongly with manually measured height values (coefficient of determination of R2 = 0.88). The ability of the system to discriminate between weed species was also assessed, with a success rate of 77.7 % in the classification of Johnson grass (Sorghum halepense).

In Lumme et al. (2008), other field crops (namely barley, oat (Avena sativa L.) and wheat) were scanned to extract information about growth heights, which was then used to study the effect of different nitrogen fertilization strategies on yield for each plot studied. In this particular study, a movable rack about 3 m high was used to obtain a better point of view and to move the sensor over the crop. In a similar study, a static 3D LiDAR scanner was used to scan a wheat crop (Hosoi and Omasa 2009). In this case, the sensor was mounted in different positions around the field to extract a complete three-dimensional model of the crop. Using LiDAR data and a profiling voxel-based method, the authors determined the plant area index and plant area density profiles, with the latter then used to estimate the actual dry weight of each organ type (ears, stems and leaves).
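
The voxel-based idea can be illustrated with the simplified sketch below, which discretizes a (synthetic) point cloud into cubic voxels and counts the occupied voxels in each height layer to obtain a vertical profile. The actual method of Hosoi and Omasa (2009) is more elaborate, correcting for occlusion and laser beam attenuation, so this is only a rough approximation of the principle.

```python
import numpy as np

# Synthetic stand-in for a LiDAR point cloud of a crop plot (x, y, z in m)
rng = np.random.default_rng(0)
points = rng.uniform([0.0, 0.0, 0.0], [2.0, 2.0, 1.0], size=(5000, 3))

voxel = 0.05                                 # voxel edge length (m)
idx = np.floor(points / voxel).astype(int)   # voxel index of each point
occupied = np.unique(idx, axis=0)            # list of occupied voxels

# Number of occupied voxels in each horizontal layer (vertical profile)
layers = np.bincount(occupied[:, 2])
for k, n in enumerate(layers):
    print(f"{k * voxel:.2f}-{(k + 1) * voxel:.2f} m: {n} occupied voxels")
```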

Sometimes, a particular crop characterization study is undertaken to improve harvest operations. This was the case in Saeys et al. (2009), where two different models of Sick LMS LiDAR sensors were used to scan small grain crop plots. In this research, the interesting option of mounting the sensor on a combine harvester to scan the amount of crop being harvested while the combine was working in the cereal field was explored.

Llop et al. (2016) showed that LiDAR sensors can also be used in greenhouses. In their study, they used the sensor to measure tomato (Solanum lycopersicum L.) crop properties. The crop layout, with narrow alleyways and tall, thin plants usually planted in pairs, made the scanning process difficult and meant that the sensor could see only one side of the plant. Despite this drawback, they found good correlations between manual and electronic measurements of volume and density.

An important factor in the case of orchards is the ability of the sensor system to generate a point cloud that defines the whole geometry of the crop, especially if the scanning process is carried out from different points of view. In general, as this kind of crop is organized in rows, the points of view are limited to the two sides of the crop. With this limitation, all the information provided from each location needs to be synchronized correctly so that the whole point cloud can subsequently be processed as a single entity. One of the pioneers of this technique was the team led by Peter Walklate (Walklate et al. 2002), who used a tractor-mounted LiDAR system to obtain structural details of an apple orchard and compare the performance of different spray deposition models. The goal was to adjust the pesticide output of an axial fan sprayer based on the estimated crop structure properties.

Similar LiDAR sensors were subsequently used for characterization of a wide range of tree crops, including pear, apple, wine grape, citrus, avocado (Persea americana) and olive (Palacín et al. 2007; Rosell et al. 2009; Llorens et al. 2011; Moorthy et al. 2011; Pfeiffer et al. 2018). In those studies, LiDAR systems were developed to obtain 3D point clouds from which large amounts of plant information could be extracted to determine variables such as height, width, volume, LAI and LAD. For some scanned trees, the coefficient of determination between manually measured volumes and those obtained from the processed point cloud was as high as 0.976. In one study on the characterization of an olive tree crop (Escolà et al. 2017), the first and second returns from the readings were analyzed, while data from the third return were not considered reliable and were discarded. Then, to present crop property measurements, they used interpolation to represent the status of the crop as a digital map. These maps, available during the growing season of the crop, can be used to extract differences in growth rate between two crop stages, an important factor for analyses of the time dynamics of a field. To improve and speed up the process, some authors have proposed scan aggregation to obtain better LAI measurements, or using information from only one side of the crop instead of scanning it from both sides. This simplification was shown to give good correlations in north–south oriented vineyards, an important factor if this process needs to be applied in different situations (Arnó et al. 2013, 2015).

Given the potential of LiDAR technology to extract crop properties, large field areas can be scanned for different purposes. Underwood et al. (2016) undertook an extensive study in tree detection, and flower and fruit mapping in almond orchards. They measured 580 trees three times each season for two years. For this, they used a LiDAR system plus an imaging system. With the LiDAR sensor they located the trees and measured their volumes and with the imaging system they counted the flowers and fruits. Scanning large areas and combining these two different technologies is an interesting option for implementation in the near future.

With the aim of improving crop management operations, Escolà et al. (2013) carried out a real-time characterization of an orchard canopy using a LiDAR sensor to feed a variable-rate algorithm for spray application. An application coefficient was used to convert canopy volume into the required spray flow rate. Good correlations, with coefficients of determination of R2 > 0.9, were obtained between spray flow rate and canopy cross-sectional area. Siebers et al. (2018) also considered crop management processes in their study. After scanning vineyards grown under different management systems (single cordon and spur, either pruned or minimally pruned) in two different locations, they obtained strong correlations between pruning weight and the volume of vine wood (trunk and cordon) extracted from LiDAR point clouds.

A new and interesting use of LiDAR data can be seen in recent studies (Gené-Mola et al. 2019a; Tsoulias et al. 2020). They used the information provided by reflectance or intensity values (amount of infrared light returned from the target) to detect and locate apples. Differences in reflectance values between leaves, branches and apples can be sufficient to detect the position of the fruits, and using a clustering process the apples can then be counted. In a subsequent study, these results were improved significantly by applying forced air flow and using multi-view approaches (Gené-Mola et al. 2020a). This detection process obtained similar results to those obtained by imaging systems, but with the advantage of providing the coordinates of the fruit location. With the same objective, but targeting oranges, Méndez et al. (2019) used a static 3D LiDAR sensor with colour capture of each point to locate and count oranges in a commercial field. They tested the system on two different crop management strategies, pruned and unpruned trees. In the pruned trees they found a significant regression between actual and modelled fruit number (R2 = 0.63, p = 0.01), but because of problems with leaf occlusion the coefficient of determination was not significant in the unpruned trees.

The 3D LiDAR systems have also been used in electronic orchard characterization. In grapevine, for example, wood measurements were used to obtain the real crop volume (trunks and cordons) (Keightley and Bawden 2010), and in tomato the plants were scanned from three different positions (Hosoi et al. 2011). Using a post-processing procedure, it was possible to convert leaf points into polygons which were in turn used to estimate leaf properties including LAD, LAI and the leaf inclination angle. Similar to this process, but also using intensity values, a system was proposed to extract the leaf inclination angle in pear trees (Balduzzi et al. 2011). The key point in that analysis was that the intensity information provided by TLS systems depends on the local inclination of the measured surface.

A particular case of the use of LiDAR sensors to improve combine harvester capabilities was presented in Zhang and Grift (2012). Using a Sick LiDAR, they evaluated its ability in static and dynamic mode to measure stem height in Miscanthus (Miscanthus × giganteus), a crop used for bioenergy production. The stem height measurement system that the authors were developing was intended for future use as a component of the so-called ‘look-ahead yield monitor’ of a combine harvester. Mean errors of just 5.08 % and 3.8 % were obtained in the static and dynamic approaches, respectively.

As ever larger crop areas have begun to be scanned using LiDAR sensors, automatic procedures to extract crop properties have become increasingly necessary. In their study, Colaço et al. (2017) scanned a 25-ha field using LiDAR with a GNSS receiver to geo-reference all the points obtained. After delimiting the individual trees along a row, they used convex-hull and alpha-shape algorithms to reproduce the shape of the crowns and then calculate the canopy volume and height. The estimated canopy properties were similar using the two algorithmic methods.
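
The convex-hull variant of this computation can be sketched in a few lines with SciPy; the crown points below are synthetic, and in a real workflow they would be the LiDAR points of one segmented tree. The alpha-shape alternative, which follows crown concavities more closely, needs additional libraries and is not shown.

```python
import numpy as np
from scipy.spatial import ConvexHull

# Synthetic stand-in for the points of one segmented tree crown (x, y, z in m)
rng = np.random.default_rng(1)
crown = rng.normal(loc=[0.0, 0.0, 2.5], scale=[0.8, 0.8, 0.6], size=(2000, 3))

# Convex hull of the crown points: in 3D, hull.volume is the enclosed volume
hull = ConvexHull(crown)
canopy_volume_m3 = hull.volume

# Canopy height taken here simply as the vertical extent of the crown points
canopy_height_m = crown[:, 2].max() - crown[:, 2].min()

print(f"Canopy volume: {canopy_volume_m3:.2f} m3")
print(f"Canopy height: {canopy_height_m:.2f} m")
```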

Finally, in relation to the recent development of full-waveform hyperspectral LiDAR systems, which allow spectral and spatial information to be obtained for each measured point, Zhang et al. (2019) proposed a practical calibration approach to resolve the system's lack of consistency in the pulse-echo arrival times of multiple spectral channels.

3.5 Depth Cameras

Since the 2000s, low-cost depth cameras have become available on the market. These devices can generate real-time 3D images without the need for moving components. Sometimes the term RGB-D camera is also used, although this term should strictly be reserved for depth cameras that simultaneously provide colour and depth information. Current depth cameras are based on one of the following operating principles (Giancola et al. 2018): structured light (SL), time-of-flight (ToF) or stereo vision (SV).

3.5.1 Structured-Light Sensors

Structured light is an active stereo vision technique in which a projector emits a known light pattern (Khoshelham and Elberink 2012); the pattern is distorted when it hits a target, generating a disparity map that is captured by a camera (Fig. 3.6). The distance is determined by applying the same expression as in stereo vision (Eq. 3.1) except that, in this case, the baseline is equal to the separation between the projector and the camera. The most popular SL camera is the Microsoft Kinect v1 (referred to hereafter as the K1), launched in 2010. This camera comprises an IR laser projector, an IR CMOS sensor and an RGB camera, and provides synchronized colour and depth images (Table 3.1). Although the K1 was originally developed for the gaming market, it quickly attained popularity in different research fields due to its ability to provide 3D colour images in real time (30 fps), its compact design and its low cost. Another advantage of SL cameras is that they require less power than ToF cameras, so active cooling is not necessary (Sarbolandi et al. 2015). The main limitations of the K1 are its short range (0.8–4 m) and the fact that it was designed for indoor conditions, with the result that some depth data are lost if the camera is operated under direct sunlight. Conversely, depth images are available at night but colour information degrades. The best performance outdoors is achieved under low-light conditions (e.g. sunrise, sunset or cloudy days) (Rosell-Polo et al. 2015).

Fig. 3.6
figure 6

Typical set-up of a structured-light system. The projected pattern is distorted by the object’s surface and then it is captured by the camera for 3D reconstruction

The K1 has been widely used for indoor plant phenotyping; Chéné et al. (2012) were the first to study this application. These authors segmented leaves of a rosebush (Rosa L.) from RGB-D data and measured the leaf curvature and orientation of a yucca (Yucca L.) plant. They also demonstrated the possibility of merging 3D plant images with thermal images to improve the identification of infected leaves. Xia et al. (2015) presented a complete methodology for 3D segmentation of plant leaves using K1 measurements that was validated under greenhouse conditions. Paulus et al. (2014) compared the K1 with two higher-cost laser scanning systems to measure plant geometry. They concluded that the K1 could replace these more expensive laser scanners in numerous phenotyping applications, particularly for characterizing volumetric objects.

With respect to field studies, Azzari et al. (2013) used the K1 at night to measure different plants (wild artichoke (Cynara cardunculus L.), bristly ox-tongue (Picris echioides L.)), estimating properties such as the basal diameter, plant height and volume. Rosell-Polo et al. (2015) combined the K1 with an RTK-GNSS positioning system to generate georeferenced 3D point clouds of apple tree rows, obtaining good correlations between the estimated tree heights and the physical values. These authors also applied the K1 to plant organ classification (leaves, flowers and branches) in apple and pear trees. As demonstrated by Andújar et al. (2016b), the K1 can also be used for plant growth monitoring.
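A minimal sketch of how sensor-frame points can be georeferenced with synchronized RTK-GNSS data, in the spirit of the combination described by Rosell-Polo et al. (2015): each scan is rotated by the platform heading and translated by the antenna position. The simplified yaw-only rotation, the coordinate values and the function name are assumptions for illustration; a full pipeline would also handle roll, pitch and the sensor-to-antenna offset.

```python
import numpy as np

def georeference(points_sensor: np.ndarray, easting: float, northing: float,
                 altitude: float, heading_rad: float) -> np.ndarray:
    """Transform (N, 3) points from the sensor frame to a local map frame
    using the platform position (e.g. from RTK-GNSS) and heading.
    Roll, pitch and the lever arm are omitted here for brevity."""
    c, s = np.cos(heading_rad), np.sin(heading_rad)
    rotation = np.array([[c, -s, 0.0],
                         [s,  c, 0.0],
                         [0.0, 0.0, 1.0]])        # rotation about the vertical axis
    translation = np.array([easting, northing, altitude])
    return points_sensor @ rotation.T + translation

# Hypothetical scan of 100 points with the vehicle heading 30 degrees
scan = np.random.default_rng(2).normal(size=(100, 3))
geo = georeference(scan, easting=402_310.0, northing=4_612_585.0,
                   altitude=215.0, heading_rad=np.radians(30))
print(geo.mean(axis=0))
```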

Characterization of woody plants and estimation of biomass is another important application of SL sensors. Nock et al. (2013) tested the Asus Xtion camera to measure woody plant stems, concluding that it was capable of measuring branches with a diameter >6 mm. In this context, Andújar et al. (2015) analyzed which viewing angle was best for biomass estimation in poplar (Populus L.) seedlings. They concluded that the angle depends on the growth stage, with a top view appropriate for one-month-old plants and a frontal view for one-year-old plants.

3.5.2 Time-of-Flight Cameras

ToF cameras are active imaging sensors that emit continuous-wave (CW) modulated light in the near-infrared (NIR) using light-emitting diodes (LEDs). As shown in Fig. 3.7, they are based on measuring the phase shift φ between the emitted and received signals for each pixel of a CMOS sensor. The depth R is computed as (Foix et al. 2011):

$$ R=\frac{c}{4\pi {f}_{\mathrm{m}}}\varphi, $$
(3.2)

where c is the speed of light and fm is the modulation frequency of the emitted signal.
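The following sketch simply evaluates Eq. 3.2 per pixel and also reports the unambiguous range reached when φ = 2π, a standard property of CW modulation; the 30 MHz modulation frequency is an illustrative value rather than that of any specific camera.

```python
import numpy as np

C = 299_792_458.0  # speed of light (m/s)

def tof_depth(phase_shift_rad: np.ndarray, modulation_freq_hz: float) -> np.ndarray:
    """Per-pixel depth from the measured phase shift, following Eq. 3.2:
    R = c / (4 * pi * f_m) * phi."""
    return C / (4.0 * np.pi * modulation_freq_hz) * phase_shift_rad

def ambiguity_range(modulation_freq_hz: float) -> float:
    """Maximum unambiguous range, reached when phi = 2*pi."""
    return C / (2.0 * modulation_freq_hz)

# Example: a 30 MHz modulation frequency gives a ~5 m unambiguous range
phi = np.array([[np.pi / 2, np.pi],
                [3 * np.pi / 2, 2 * np.pi]])
print(tof_depth(phi, 30e6))
print(ambiguity_range(30e6))
```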

Fig. 3.7
figure 7

Principle of operation of a ToF camera. (Adapted from Vázquez-Arellano et al. 2016)

Conventional ToF cameras provide depth data (depth image) and the intensity of the return signal (amplitude signal), but they do not give colour information. Some ToF cameras (e.g. Mesa Imaging SR4000 or PMD CamCube) also provide a confidence matrix to indicate the reliability of the depth data.

The main advantage of ToF cameras over LiDAR and SV systems is their ability to provide depth data at high sampling rates. Their limitations include the low resolution of the depth images (Table 3.1) and the noise associated with the depth data. Kazmi et al. (2014) compared several ToF cameras with an SV sensor both indoors and outdoors. They concluded that, although SV provides higher resolution images and is more robust under sunlight, ToF cameras have the advantages of lower computational costs and the ability to measure targets with non-uniform textures.

In crop characterization, the successful application of ToF cameras for automatic plant probing and phenotyping has been demonstrated (Klose et al. 2009; Alenya et al. 2013). Plant phenotyping is usually performed manually, so its automation using ToF cameras reduces labour hours and human error. In this context, Chaivivatrakul et al. (2014) developed an automatic phenotyping system in which a ToF camera is combined with a rotating table. Measurements were made on maize plants, estimating the leaf geometry (length, area, angle) and the stem diameter. Likewise, van der Heijden et al. (2012) developed a system for the phenotyping of greenhouse pepper plants by combining RGB and ToF cameras. Strong correlations were obtained between the estimated plant heights and manual measurements. For the phenotyping of small grain cereals under field conditions, Busemeyer et al. (2013) presented a multi-sensor platform that integrated various optical sensors (ToF cameras, light curtain imaging, laser distance sensors, hyperspectral and RGB cameras) in a modular architecture adaptable to different species.

Another application of ToF cameras is the automatic sensing of inter-plant spacing at early growth stages (Nakarmi and Tang 2012). Accurate measurements of inter-plant spacing can help distribute inputs evenly between plants. Karkee and Adhikari (2015) used ToF cameras to generate apple tree skeletons. From these 3D reconstructions, a branch identification accuracy of 77 % was achieved, a first step towards the development of an automatic pruning system.

In 2014, Microsoft launched a second version of the Kinect sensor (hereafter referred to as K2) based on the ToF principle instead of structured-light (Table 3.1). The main difference between the K2 sensor and conventional ToF cameras is the presence of an RGB camera, which provides colour information together with the depth data (like the K1). In experimental tests of the K2, Sarbolandi et al. (2015) concluded that, unlike the K1, it can be used for operation in daylight conditions. This feature is particularly interesting for agricultural applications, although it should be noted that the number of measured points is reduced as the sunlight increases. In addition, the K2 is more accurate and precise than the K1. Fig. 3.2c shows the 3D point cloud of an apple tree obtained using a K2 sensor. Although the point density is less than that provided by LiDAR measurements (Fig. 3.2b), the K2 provides colour information. Compared to SfM (Fig. 3.2a), K2 point clouds are less realistic, but real-time acquisition is possible.

These factors explain why the K2 has become very popular in 3D crop characterization as a low-cost alternative to LiDAR systems. For instance, the K2 was used as the basis for the development of plant-to-plant phenotyping platforms under controlled conditions (McCormick et al. 2016; Sun and Wang 2019). With respect to field phenotyping, several studies have used the K2 to determine multiple geometric properties of maize plants (Hämmerle and Höfle 2016; Vázquez-Arellano et al. 2018; Bao et al. 2019b), including plant height and orientation, leaf angle, and stem diameter and position.

The K2 was also used for weed detection and control by Andújar et al. (2016a). In this study, 3D measurements were used to separate weeds from maize plants and subsequently obtain weed and maize estimates of biomass. With the ultimate goal of developing a system for robotic weed control, Gai et al. (2020) used the K2 to detect broccoli (Brassica oleracea L. var. botrytis L.) and lettuce (Lactuca L.) at different growth stages. They achieved detection rates of more than 90 % for both crops.

The K2 has also been proposed by Rosell-Polo et al. (2017) as a cost-effective alternative to MTLS for the 3D characterization of orchard crops. These authors developed an MTLS based on the combination of a K2 with an RTK-GNSS system. The system was validated experimentally in a vineyard, with the most accurate 3D point clouds obtained when the field of view was adjusted to a single vertical pixel column. Similarly, Bengochea-Guevara et al. (2017) used a mobile platform with an on-board K2 sensor to enable the easy acquisition of information in the field. Their approach was also applied in vineyards and included a method to correct the drift that arises in the 3D reconstruction of long crop rows.

3.5.3 Active Stereo Vision

Recently, new low-cost compact stereo vision sensors, such as the Intel RealSense series, have appeared on the market. These devices include a laser projector that emits a known pattern to help improve the image matching process. Vit and Shani (2018) compared these sensors with other current SL and ToF cameras and demonstrated that the new active stereo sensors are suitable for outdoor agricultural phenotyping, with the advantage of lower power requirements. Likewise, Milella et al. (2019) developed an automatic grapevine phenotyping system based on a RealSense camera and showed its capacity for grape bunch detection and canopy volume estimation.

3.6 Conclusions

As with all data acquisition technologies, the expectation is that crop geometry sensing will become cheaper and faster in the future, and that larger data volumes and more accurate results will be possible. For now, as no single sensor can meet all the requirements, the most suitable sensor needs to be chosen for each particular situation. In this respect, an important limitation of photogrammetric sensors is the effect of changes in lighting conditions. In contrast, LiDAR systems are more reliable, but their cost is significantly higher and they do not provide colour information. RGB-D sensors are a low-cost solution that records both colour and depth images, but their resolution is lower than that obtained with LiDAR systems or photogrammetry.

At the time of writing, only a limited amount of equipment fitted with ultrasonic sensors, and in some cases LiDAR, is available on the market for use with commercial agricultural machinery. It is currently used only for high-value crops or on machines such as those for pesticide application or for harvesting extensive field crops. The logical next step, which would lead to the large-scale incorporation of sensors in agricultural applications, is for crop geometry characterization to become sufficiently accurate and fast to allow the real-time adjustment of the operating parameters of the machine (spray rate, cutting height, etc.).

The other technologies (ToF cameras, SL sensors, active SV, and so on) remain in the research phase, although some commercial applications are possible in the near future. Their practical implementation will generally be determined on a cost-benefit basis. Photogrammetry, for example, has an interesting future because it can be applied using low-resolution cameras with less demanding specifications. If the cost of the sensor exceeds the cost of the equipment or machine to which a new capability must be added, it is clear that it will not be implemented commercially. If, on the other hand, the sensor is used to generate maps that provide information for future field operations, that is something that could be implemented in the near future. In other words, the real-time adjustment of an agricultural operation will be more difficult to achieve than map-based applications. Thus, we can anticipate that complex equipment (with the best sensors and processing systems) will first be used to produce accurate maps of crops. Agricultural machinery that performs specific operations will then be able to read these maps and adapt its operations to the needs of the crop. Simpler sensors than those used for mapping can subsequently be installed on the machines to fine-tune their operation, improving on the resolution offered by the maps.

In relation to the data obtained from all the sensors described in this chapter, another issue that needs to be improved is the volume of data that can be processed. Dealing with large amounts of data from continuous scanning is a major problem, especially when manual processes are required. For this reason, automated systems to process large and accurate point cloud files are necessary (Colaço et al. 2017). In this context, another aspect that needs to be improved is automatic on-board processing of data through which the final result can be made available as soon as the scanning process ends (Underwood et al. 2016).

Powerful tools to deal with this type of problem exist in the fields of artificial intelligence and big data. In this regard, deep learning for computer vision and data processing has emerged as a family of techniques with great potential in agriculture (Kamilaris and Prenafeta-Boldú 2018). For example, convolutional neural networks (CNNs) can be used to detect, classify or segment vegetative organs (trunks, leaves, fruits, etc.) in 2D images (Jiang and Li 2020). In addition, the simultaneous availability of colour and depth information (multi-modal data), obtained with RGB-D cameras or photogrammetry, can be used to improve the performance of neural networks, as demonstrated for branch and fruit detection applications (Zhang et al. 2018a; Gené-Mola et al. 2019b). Depth data can also be used to project 2D detections onto 3D point clouds (Shi et al. 2019; Gené-Mola et al. 2020b).
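As a sketch of this projection idea, the function below back-projects the pixels inside a 2D bounding-box detection into 3D camera coordinates using a registered depth image and the pinhole model. The intrinsic parameters, image size and bounding box are hypothetical and are not drawn from the cited studies.

```python
import numpy as np

def project_box_to_3d(depth_m: np.ndarray, box: tuple[int, int, int, int],
                      fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
    """Back-project the pixels inside a 2D box (u_min, v_min, u_max, v_max)
    into 3D camera coordinates using a registered depth image and the
    pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    u_min, v_min, u_max, v_max = box
    vs, us = np.mgrid[v_min:v_max, u_min:u_max]     # pixel row (v) and column (u) grids
    z = depth_m[v_min:v_max, u_min:u_max]
    valid = z > 0                                   # keep only pixels with valid depth
    x = (us[valid] - cx) * z[valid] / fx
    y = (vs[valid] - cy) * z[valid] / fy
    return np.column_stack((x, y, z[valid]))        # (M, 3) points in metres

# Hypothetical intrinsics and a dummy depth image of constant 1.5 m
depth = np.full((480, 640), 1.5)
points = project_box_to_3d(depth, (300, 200, 340, 260),
                           fx=525.0, fy=525.0, cx=319.5, cy=239.5)
print(points.shape)
```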

A pending challenge is the development of CNN models that allow the direct processing of 3D point clouds. Jin et al. (2020), for example, implemented a voxel-based convolutional network to separate maize stems and leaves from the measurements of a terrestrial LiDAR system. Their results outperformed those obtained with traditional clustering methods. However, the application of deep learning techniques to crop geometric characterization needs more publicly available datasets with multi-modal images of different crops to train the models.
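Voxel-based networks such as the one described by Jin et al. (2020) first discretize the point cloud into a regular occupancy grid. Below is a minimal, assumed sketch of that voxelization step only (not the authors’ network); the grid resolution and the synthetic plant cloud are purely illustrative.

```python
import numpy as np

def voxelize(points: np.ndarray, voxel_size: float,
             grid_shape: tuple[int, int, int]) -> np.ndarray:
    """Convert an (N, 3) point cloud into a binary occupancy grid that
    could serve as input to a voxel-based 3D CNN."""
    origin = points.min(axis=0)                       # anchor the grid at the cloud minimum
    idx = np.floor((points - origin) / voxel_size).astype(int)
    idx = np.clip(idx, 0, np.array(grid_shape) - 1)   # clamp points falling outside the grid
    grid = np.zeros(grid_shape, dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0       # mark occupied voxels
    return grid

# Hypothetical maize-plant point cloud voxelized at 1 cm resolution
rng = np.random.default_rng(1)
cloud = rng.uniform([0, 0, 0], [0.6, 0.6, 1.8], size=(5000, 3))
occupancy = voxelize(cloud, voxel_size=0.01, grid_shape=(64, 64, 192))
print(int(occupancy.sum()), "occupied voxels")
```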

Deciphering the complexity of plant and crop structures has been made possible through the availability of advanced data extraction technologies. As the volumes of data increase and the information becomes ever more complex, increasing the speed at which the data can be extracted and analyzed is the next big challenge for the present and future generations of precision agricultural engineers.