1 Introduction

Due to the lack of a skilled workforce and increasing labor costs, advanced automation is required for greenhouse production systems [1]. Despite intensive R&D on harvesting robots, there are no commercial harvesting robots for sweet peppers [2, 3]. Robotic harvesting of sweet peppers includes several tasks: detecting the fruit, approaching it, deciding whether the fruit is ripe, and finally detaching the fruit from the stem [4, 5].

The limitation most commonly tackled today is non-optimal detection rates; Bac et al. [3] reported a state of the art of 85% in their 2014 review. Viewpoint analyses in harvesting robotics indicate that only 60% of the fruit can be detected from a single detection direction [6]. Therefore, current research focuses on detection algorithm development [3, 6, 7, 8].

Another challenge often described in the literature is how to grasp a fruit, due to the limitations of available robotic grippers and the inherent difficulties of grasp planning [9, 10]. Eizicovits and Berman [10] developed geometry-based grasp quality measures based on 3D point clouds to determine the best grasping pose of different objects, including sweet peppers. This kind of solution depends on detailed 3D sensor information of the object [11], which is very difficult to obtain in dense greenhouse environments. These environments are unstructured and dynamic [12]: fruits have a high inherent variability in size, shape, texture, and location; in addition, occlusion and variable illumination conditions significantly influence detection performance.

Given the complexity of both the detection and grasp-planning tasks, the approach to the correct fruit pose must be planned dynamically, taking into account obstacles such as stems and leaves. The most common way to do this is visual servoing, i.e., using eye-in-hand sensing to guide the robot towards the fruit by keeping it in the center of the image [13]. When using this method, it is crucial to choose the approach direction with the least occlusion from leaves and other obstacles, to maximize the chance that visual servoing reaches the desired grasping pose. This research measures the value of ideal information about the best approach direction for successful visual servoing, compared to a method that uses a search pattern to find the best direction.

2 Methods

A 6-DOF Fanuc LR Mate 200iD robotic manipulator, equipped with an eye-in-hand IDS UI-5250RE RGB camera and a Sick DT20HI displacement measurement laser sensor, was placed in front of an artificial plastic pepper crop with yellow plastic fruits and green leaves (Fig. 1). The workflow of the robot was implemented using a generic software framework for the development of agricultural and forestry robots [14]. The framework is built on a hybrid robot architecture, using a state machine that implements a flowchart as described by Ringdahl et al. [15].
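As a rough illustration of how such a flowchart-driven workflow can be organized, the following minimal state machine sketch loops over target fruits; the state names and transition logic are illustrative assumptions, not the actual states of the framework in [14, 15].

```python
# Minimal sketch of a flowchart-style state machine driving the harvesting
# workflow; states and transitions are illustrative, not the framework's own.
def harvesting_state_machine(fruit_queue):
    state = "MOVE_TO_OVERVIEW"
    while state != "DONE":
        if state == "MOVE_TO_OVERVIEW":
            state = "SELECT_FRUIT"                 # move to overview pose W0
        elif state == "SELECT_FRUIT":
            state = "APPROACH" if fruit_queue else "DONE"
        elif state == "APPROACH":
            fruit = fruit_queue.pop(0)             # waypoint move + visual servo
            print("approached", fruit)
            state = "SELECT_FRUIT"
    print("cycle finished")

harvesting_state_machine(["fruit_1", "fruit_2"])
```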

Fig. 1. The experimental setup consisted of a robotic harvester in front of an artificial crop.

A scene consisting of five plastic fruits placed at different locations on two artificial stems was set up before each experiment. The number of fruits was set to five to resemble an actual sweet pepper plant; the right stem held three fruits and the left stem two. Each fruit had one or two leaves placed on different sides (left/front/right) of it to create occlusion. An example of an overview image taken by the robot can be seen in Fig. 2. For each fruit, the “optimal” harvesting approach direction was noted manually, defined as the angle, from either the left (−45°), the front (0°), or the right (45°), at which the target was least occluded. Figure 3 shows a flowchart describing the decision process for the manual selection.
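The manual selection of Fig. 3 amounts to picking the least-occluded of the three candidate directions. A minimal sketch of that logic is given below; the numeric occlusion scores are a hypothetical stand-in for the visual judgement made by the experimenter.

```python
# Hypothetical sketch of the decision process in Fig. 3: choose the
# candidate approach direction with the least occlusion.
from typing import Dict

CANDIDATE_DIRECTIONS = (-45, 0, 45)  # left, front, right (degrees)

def optimal_approach_direction(occlusion: Dict[int, float]) -> int:
    """Return the least-occluded direction for one fruit.

    `occlusion` maps each candidate direction to a score in [0, 1];
    in the experiment this judgement was made by eye, so the scores
    here are purely illustrative.
    """
    return min(CANDIDATE_DIRECTIONS, key=lambda d: occlusion.get(d, 1.0))

# Example: a leaf on the left, slight overlap in front, right side clear.
print(optimal_approach_direction({-45: 0.8, 0: 0.2, 45: 0.0}))  # -> 45
```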

Fig. 2. An overview image taken from the robot’s camera, looking at a laboratory scene with five peppers on two stems covered by leaves.

Fig. 3. Decision flowchart for manually selecting the optimal approach direction to a pepper.

2.1 Harvesting Scenarios

Two harvesting scenarios were tested. The first scenario, the full a-priori knowledge scenario, represents the ground truth, in which both the position \( P_{i} \left( {x_{i} ,y_{i} ,z_{i} } \right) \) and the approach direction \( \theta_{i}^{ * } \) are known for each fruit \( i \). The harvesting cycle consists of approaching a pre-defined overview waypoint \( W_{0} \left( {x,y,z} \right) \) and then selecting each target fruit in order from the list of positions and optimal approach directions of all fruits. The control unit then calculates the path of the robotic manipulator to a waypoint \( W_{i} \left( {x,y,z} \right) \), positioned at a defined distance from fruit \( i \) with respect to the optimal harvesting approach direction and position \( \left( {x_{i} ,y_{i} ,z_{i} ,\theta_{i}^{ * } } \right) \). After reaching the waypoint, a visual servo procedure based on color blob detection and distance measurements from the laser guides the manipulator towards the target until the end-effector touches the fruit. If the manipulator reaches the target fruit, the harvest of that fruit is marked as successful and the path to the next waypoint is calculated. If the fruit is not found, or is lost from view during visual servoing, the harvest of that fruit is marked as failed and the path to the next waypoint is calculated. The cycle ends when all fruits have been attempted. The left part of Fig. 4 shows a flowchart of this harvesting scenario.
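A minimal, self-contained sketch of this cycle is given below; the classes, distances, and step size are illustrative assumptions, not the actual framework API or servo parameters.

```python
# Minimal sketch of the full a-priori knowledge harvesting cycle.
# All names and numeric values are illustrative assumptions.
from dataclasses import dataclass

TOUCH_DISTANCE_M = 0.01   # assumed contact threshold
SERVO_STEP_M = 0.02       # assumed bounded step between processed images

@dataclass
class Fruit:
    position: tuple           # (x, y, z), measured in advance
    optimal_direction: float  # theta* in degrees: -45, 0 or 45

def visual_servo(start_distance_m: float, max_steps: int = 100) -> bool:
    """Toy stand-in for the servo loop (color-blob centring + laser range):
    step towards the target until contact or until the step budget runs out."""
    distance = start_distance_m
    for _ in range(max_steps):
        if distance <= TOUCH_DISTANCE_M:
            return True        # end-effector touched the fruit -> success
        distance -= SERVO_STEP_M
    return False               # fruit lost / never reached -> failure

def harvest_cycle_a_priori(fruits, waypoint_offset_m=0.30):
    """Approach each fruit from its known optimal direction theta*."""
    results = []
    for fruit in fruits:
        # Path planning from the overview waypoint W0 to W_i (placed at
        # waypoint_offset_m along fruit.optimal_direction) is omitted.
        results.append(visual_servo(waypoint_offset_m))
    return results

print(harvest_cycle_a_priori([Fruit((0.40, 0.10, 0.30), 45)]))  # [True]
```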

Fig. 4. Flowchart describing the two harvesting scenarios. Left: full a-priori knowledge scenario. Right: auto approach direction search scenario (differences marked with dashed lines).

The second scenario, the auto approach direction search scenario, is a variation of the ground-truth scenario in which the optimal approach direction \( \theta_{i}^{ * } \) is unknown and must therefore be searched for among a list of predefined possible approach directions \( \theta_{1} \ldots \theta_{k} \). For each target fruit \( i \), the control unit calculates the path of the robotic manipulator to a waypoint \( W_{ij} \left( {x,y,z} \right) \), positioned at a defined distance from the target fruit with respect to the current candidate direction \( \theta_{j} \), and attempts the approach; candidate directions are tried in turn until the harvest of the fruit is marked as successful or sight of the fruit is lost from every direction. If the approach is successful, the path to the waypoint for fruit \( i + 1 \) and \( \theta_{1} \) is calculated. If the fruit is lost during visual servoing, the next approach direction \( \theta_{j + 1} \) is selected. If all approach directions \( \theta_{1} \ldots \theta_{k} \) have been attempted without reaching the fruit, the harvest of the target fruit is marked as failed and the path to the waypoint for fruit \( i + 1 \) and \( \theta_{1} \) is calculated. The right part of Fig. 4 shows a flowchart of this harvesting scenario.
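The search loop can be sketched as follows; the try_approach stand-in simply succeeds when the fruit is reachable from the attempted direction, which is a deliberate simplification of the real move-and-servo procedure.

```python
# Self-contained sketch of the auto approach direction search scenario.
# try_approach() is an illustrative stand-in for "move to W_ij, then
# visual-servo until contact or loss of sight"; it is not the real procedure.

SEARCH_PATTERNS = {
    "side_first":   [-45, 0, 45],   # left - center - right
    "center_first": [0, -45, 45],   # center - left - right
}

def try_approach(clear_directions, direction) -> bool:
    """Assume the approach succeeds only if the fruit is unoccluded
    from the attempted direction."""
    return direction in clear_directions

def harvest_with_search(fruits, pattern="center_first"):
    """fruits: one set of reachable directions per fruit."""
    attempts_per_fruit = []
    for clear_directions in fruits:
        attempts = 0
        for theta in SEARCH_PATTERNS[pattern]:
            attempts += 1
            if try_approach(clear_directions, theta):
                break          # success: mark fruit harvested, go to next fruit
        attempts_per_fruit.append(attempts)
    return attempts_per_fruit

# A fruit reachable only from the left is harvested on the first attempt
# with the side-first pattern but on the second with center-first:
print(harvest_with_search([{-45}], "side_first"))    # [1]
print(harvest_with_search([{-45}], "center_first"))  # [2]
```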

2.2 Experimental Protocol

Six laboratory scenes with different leaf placements and optimal approach directions were set up as defined in Table 1. The pose of each pepper was measured by manually moving the robotic arm in the desired approach direction to the position where the gripper touched the fruit, as seen in Fig. 5.

Table 1. Six scenes with different configurations of leaf placement (L = left, F = front, R = right) and approach direction (−45°, 0°, 45°).
Fig. 5. The pose of each pepper was measured by manually moving the robotic arm in the desired approach direction to the position where the gripper touched the fruit.

A harvesting cycle is performed for each combination of scene and configuration. Each of the six scenes is run in three configurations:

  • Full a-priori knowledge scenario, selecting the optimal approach direction from the set {−45°, 0°, 45°}

  • Auto approach direction search scenario with two different search patterns:

    • Side first: \( \theta_{j} = \left[ { - 45^{\circ} , 0^{\circ} , 45^{\circ} } \right] \) (left-center-right)

    • Center first: \( \theta_{j} = \left[ {0^{\circ} , - 45^{\circ} , 45^{\circ} } \right] \) (center-left-right)

Each configuration is performed at both 50% and 100% of maximum robot speed to enable a sensitivity analysis with respect to robot speed, giving 6 scenes × 3 configurations × 2 speeds = 36 harvesting cycles (180 fruit approaches in total); the resulting design is enumerated in the sketch below. At the end of each harvesting attempt, the cycle time and the outcome of the attempt (success/failure) are recorded.
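The experimental design can be enumerated as follows; the scene labels are placeholders for the setups defined in Table 1.

```python
# Sketch enumerating the experimental configurations; scene labels are
# placeholders for the setups defined in Table 1.
from itertools import product

scenes = [f"scene_{k}" for k in range(1, 7)]          # Table 1
configurations = ["a_priori", "side_first", "center_first"]
speeds = [0.5, 1.0]                                    # fraction of max speed

cycles = list(product(scenes, configurations, speeds))
print(len(cycles), "harvesting cycles,", len(cycles) * 5, "fruit approaches")
# -> 36 harvesting cycles, 180 fruit approaches
```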

2.3 Measures and Statistical Analysis

To evaluate the performance, the following three measures are defined:

  • Pepper harvest time \( Th \) is the time from when a fruit is selected from the list of fruit poses until it has been successfully harvested (all fruits were eventually harvested in the experiments).

  • Average logarithmic harvest time \( LTh \), as defined in Eq. 1 (a small computational sketch follows this list).

    $$ LTh = \frac{1}{n}\sum_{i = 1}^{n} \ln \left( {Th_{i} } \right) $$
    (1)

    where \( n \) is the number of successfully harvested fruits.

  • The number of attempted approach directions \( N\theta_{i} \) for fruit \( i \).
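A small computational sketch of \( LTh \) (Eq. 1) is shown below; the harvest times used are made-up placeholders, not measurements from the experiment.

```python
# Sketch of the average logarithmic harvest time (Eq. 1); the times
# below are placeholders, not measurements from the experiment.
import math

def average_log_harvest_time(harvest_times_s):
    """LTh = (1/n) * sum(ln(Th_i)) over successfully harvested fruits."""
    return sum(math.log(t) for t in harvest_times_s) / len(harvest_times_s)

print(round(average_log_harvest_time([6.7, 8.5, 11.2]), 3))
```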

In addition to descriptive statistics of these measures, the statistical significance of differences in their values was tested. The pepper harvest time \( Th \) is analyzed using a log-transformed linear regression [16]:

$$ \ln \left( {Th_{i} } \right) = \beta_{0} + \beta_{1} H_{c_{i}} + \beta_{2} O_{i} + \beta_{3} V_{R} + \beta_{4} O_{F_{i}} + \beta_{5} H_{c_{i}} \cdot O_{i} + \epsilon_{i} $$
(2)

where \( H_{c_{i}} \) is the harvesting scenario of pepper \( i \), \( O_{i} \) is the number of occluding leaves, \( V_{R} \) is the robot speed, \( O_{F_{i}} \) is the front occlusion (1 if the front is occluded, 0 otherwise), and \( \beta_{0} , \beta_{1} , \beta_{2} , \beta_{3} , \beta_{4} , \beta_{5} \) are the corresponding regression coefficients to be estimated. Additionally, a \( \chi^{2} \) test of independence [17] is performed to analyze the relation between the number of failed approach directions \( N_{\theta F_{i}} \) and the harvesting scenario \( H_{c_{i}} \).
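As an illustration, the model in Eq. 2 can be fitted by ordinary least squares on the log-transformed times, for example with statsmodels; the data frame below is randomly generated for illustration only and is not the data collected in the experiment.

```python
# Hypothetical sketch of fitting Eq. 2 by OLS on log-transformed harvest
# times. The data frame is synthetic; it is not the experimental data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 60
data = pd.DataFrame({
    "scenario":  rng.choice(["a_priori", "center_first", "side_first"], n),  # H_c
    "n_leaves":  rng.integers(1, 3, n),                                      # O
    "speed":     rng.choice([0.5, 1.0], n),                                  # V_R
    "front_occ": rng.integers(0, 2, n),                                      # O_F
})
data["Th"] = np.exp(2.0 + 0.3 * data["front_occ"] + rng.normal(0, 0.3, n))

# ln(Th) = b0 + b1*H_c + b2*O + b3*V_R + b4*O_F + b5*H_c*O + eps   (cf. Eq. 2)
model = smf.ols("np.log(Th) ~ C(scenario) * n_leaves + speed + front_occ",
                data=data).fit()
print(model.summary())
```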

3 Results

To determine the value of an optimal harvesting approach direction, a total of 180 fruit harvesting attempts were performed on 6 scenes with 5 artificial peppers each, set up according to Table 1, with the different harvesting scenarios (full a-priori, center first search pattern, and side first search pattern) and two robot velocities (50% and 100% of maximum). The total average harvest time \( \overline{Th} \) for all combinations was 8.56 s (SD = 3.88 s). The distribution among the three harvesting scenarios is presented in Fig. 6. The results show a roughly 40–45% increase in average harvest time when no a-priori information about the correct harvesting direction is available.

Fig. 6. Average harvesting time as a function of the harvesting scenario.

A Tukey HSD homogeneous-subsets test shows a significant difference (p-value = 0.011) between \( LTh \) (Eq. 1) for the full a-priori scenario and the center first search pattern scenario. The difference between \( LTh \) for the full a-priori and side first search pattern scenarios was also significant (p-value = 0.006). The difference between \( LTh \) for the two search patterns was statistically insignificant (p-value = 0.98).
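For reference, such a pairwise comparison can be produced with statsmodels' Tukey HSD routine; the grouped log-times below are synthetic placeholders, not the measured values.

```python
# Hypothetical sketch of a pairwise Tukey HSD comparison of log harvest
# times between scenarios; log_times and scenarios are synthetic data.
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(1)
scenarios = np.repeat(["a_priori", "center_first", "side_first"], 20)
log_times = np.log(rng.gamma(shape=9.0, scale=1.0, size=60))

print(pairwise_tukeyhsd(endog=log_times, groups=scenarios, alpha=0.05))
```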

Results of the log-transformed \( \ln\left( Th \right) \) regression model (Eq. 2) revealed significance for front occlusion (p-value < 0.001) and harvesting scenario (p-value = 0.02). The number of occluding leaves was not significant on its own (p-value = 0.774) but was borderline significant in interaction with the harvesting scenario (p-value = 0.098). A profile plot describing the interaction is presented in Fig. 7. It shows that both search patterns have shorter harvesting times for less occluded scenes. In the full a-priori information scenario, harvesting appears to take slightly less time in more complicated scenes with higher occlusion than in simpler scenes; however, this difference was statistically insignificant (p-value = 0.16). The difference between the two robot velocities (50% or 100% of maximum) was also insignificant (p-value = 0.155). This can be explained by the visual servoing technique, which limits step sizes between images so that the robot never reaches its maximum speed during this phase; the limitation is needed to provide sufficient time to process image data during visual servoing.

Fig. 7. Profile plots for occlusion level and search method.

Of the 180 harvesting attempts performed, all 60 approaches (100%) with full a-priori information were successful on the first attempt, with an average harvesting time of 6.71 s (SD = 3.05 s). Of the 120 cycles performed using a search pattern, 76 (63%) were successful on the first attempt, with an average harvesting time of 6.62 s (SD = 2.78 s); 30 cycles (25%) were successful on the second attempt, with an average time of 11.16 s (SD = 5.4 s); and the remaining 14 cycles (12%) were successful only on the third attempt, with an average time of 21.34 s (SD = 6.9 s). The numbers of highly occluded and partially occluded peppers were roughly the same (46% and 54%, respectively). While the average harvesting time increased nearly linearly with the number of attempts, the standard deviation also increased for the more complex cases that required more attempts. The number of approaches performed until successful harvest, as a function of the search pattern method, is presented in Fig. 8. About 30% more fruits were harvested on the first attempt with the side first search pattern than with the center first pattern. A \( \chi^{2} \) test of independence showed a borderline significant dependence between the search method and the number of attempts (p-value = 0.0978).
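The independence test can be illustrated with a 2 × 3 contingency table of search method versus number of attempts, for example with scipy; the per-cell counts below are hypothetical (only their row and column totals match the aggregates reported above).

```python
# Hypothetical sketch of the chi-squared independence test between search
# method and number of attempts; the cell counts are placeholders, not the
# observed counts from the experiment.
from scipy.stats import chi2_contingency

# Rows: search method (side first, center first);
# columns: attempts until success (1st, 2nd, 3rd).
counts = [[43, 11, 6],
          [33, 19, 8]]

chi2, p_value, dof, expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}, dof = {dof}")
```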

Fig. 8. Number of approaches until successful harvest as a function of the search pattern method.

4 Conclusions

Results show a significant increase in harvesting time when using a search pattern compared to having ideal initial information about the harvesting direction. The harvesting time grows nearly linearly with the number of approaches required until successful harvest. Furthermore, the variability of the harvesting time grows with the number of approaches required, reducing the ability to predict harvesting times. It is therefore clear that ideal information about the best harvesting approach direction is valuable for increasing the performance of a robotic harvesting system.

The harvesting time does not differ significantly between the two harvesting direction search patterns. This should be validated on a greater variety of search patterns and in greenhouse conditions, where occlusion is less likely to appear in the random manner designed into this experiment. To see how the results depend on the kind of robot used, validating them with a robot with a different kinematic setup would also be beneficial. It was shown that if the front of a fruit is occluded, harvesting times increase significantly compared to fruits that can be harvested from the front, regardless of search method. The major reason for this is the limited workspace of the robot: the distance to the fruits is around 35–40 cm, with leaves often even closer, and the gripper mounted on the end of the robot is 24 cm long. This makes it difficult to reach positions to the side of the peppers, and the paths often become quite long due to the limited space and the joint limits of the robot. Pruning techniques used for crop optimization might take this into consideration to facilitate robotic harvesting.

About 30% more fruits were harvested on the first attempt when using the side first search pattern than when using the center first pattern. An equal number of scene configurations had fruits blocked by leaves from the left and from the center; therefore, the number of approaches would have been expected to be equal for both search patterns. A probable explanation is that some fruits were detected during visual servoing even though they were (partly) blocked by leaves and therefore should not have been possible to harvest. This occurred in 26% of all harvest attempts from the left and in 13% of all attempts from the front. However, this most likely did not affect the reported success rates, since they are calculated from actual harvest approach outcomes, i.e., whether the robot actually reached the fruit.

The results of this research have identified significant factors affecting harvesting times and success rates under laboratory conditions. Validation of the results through experiments in greenhouse conditions is suggested; this must be done during the growing season, when ripe fruits are available.