
1 Introduction

Plant phenotype is an objective expression of plant growth and plays an important role in plant research and agricultural production. In traditional agricultural and forestry research, plant phenotypic analysis relies on manual observation and measurement of plants. This approach is time-consuming, labor-intensive, and heavily dependent on subjective experience. The integration of modern information technology and agriculture has given rise to new production modes such as intelligent perception of agricultural information, precise monitoring, and quantitative decision-making, which in turn has driven the development of precision agriculture [1]. Using modern information technology to obtain the phenotypes of plants and crops non-destructively and to quantify the growth state of crops is one of the important steps toward precision agriculture [2].

With the wide application of computer vision in agriculture and forestry, plant disease detection [3] and fruit maturity judgment [4] can be achieved by using two-dimensional (2D) images to obtain plant information. Because a three-dimensional (3D) model carries more information than a 2D image, using 3D point cloud models to monitor and manage agricultural production is a promising direction [5]. Wu et al. [6] used point clouds to generate the skeleton of maize plants and estimated phenotypes such as plant height, leaf length, leaf inclination, and azimuth. Rueda-Ayala et al. [7] estimated the plant height, biomass, and volume of sweet wheat grass and ryegrass from the point cloud of grassland. Cabo et al. [8] automatically identified tree stems by analyzing point clouds and then estimated tree height and trunk diameter. The plant surface morphology obtained from point clouds has many uses, such as judging the growth state of plants, predicting crop yield, and calculating volume.

At present, most methods for the 3D reconstruction of plants are based on optical non-contact sensors. Wang et al. [9] used terrestrial laser scanning (TLS) to obtain the point cloud of corn plants; the method is fast and simple to operate, but the scanning equipment is expensive. Vázquez-Arellano et al. [10] used a time-of-flight (TOF) camera to sequentially collect point clouds of corn seedlings in an experimental field and registered them with the iterative closest point (ICP) algorithm. Their registration method relies on point clouds whose initial poses are roughly aligned and overlapping, and the capture device is costly. Chen et al. [11] used an RGB camera to take multiple photos of kale, wheat, and physalis placed on a turntable, calculated the camera intrinsic parameters and poses with incremental structure from motion (SFM), used MVSNet to generate depth maps, and finally generated a 3D point cloud. The point cloud generated by this method is dense, but the turntable setup is only suitable for low plants and not for outdoor scenes. Ni et al. [12] used a binocular camera to capture stereo image pairs of plants, used the efficient large-scale stereo matching (ELAS) algorithm to calculate the disparity map, and then obtained the 3D point cloud model of the plant through triangulation. When the images are taken at close range, the texture of the leaves is displayed clearly, but the reconstruction is poor at long range. Guan et al. [13] developed an imaging system consisting of an RGB camera and a PMD camera. Using the DBSCAN algorithm, they extracted the point cloud of the soybean plant canopy from the original single-view point cloud and generated point cloud models from the side view and the top view. However, the two models are not stitched together, so the phenotypic information is incomplete.

The methods in the above-mentioned articles are almost all applied to obtain 3D point cloud models of low plants. For tall plants and trees, however, the limited field of view of the sensor means that these methods cannot obtain a high-integrity point cloud model covering the top, bottom, and side views. To solve this problem, point clouds captured from multiple angles and an efficient point cloud registration algorithm are required.

Among the point cloud registration algorithms, the most classic is the iterative closest point (ICP) algorithm proposed by Besl et al. [14]. In this method, the point pairs with the smallest Euclidean distance are used as matching points, and the transformation matrix between the two point clouds is calculated by the least-squares method. The matching points are then re-selected and the transformation recalculated iteratively until the error falls below a threshold, at which point the transformation matrix is regarded as optimal. However, this method requires a good initial value, is sensitive to noise, and easily becomes trapped in a local optimum [15].

In order to improve the accuracy of registration, many scholars divide the registration into two steps, that is, first use geometric features to obtain rough transformation, and then use ICP for fine registration. Accurately computing transformation matrices relies on accurate descriptions and correct matching of features [16]. At present, representative local feature algorithms include 3D shape context (3DSC) proposed by Frome et al. [17], the signature of histograms of orientations (SHOT) proposed by Salti et al. [18], point feature histogram (PFH) [19], and fast point feature histogram (FPFH) [20] proposed by Rusu et al.

FPFH is a 3D feature descriptor based on PFH. It uses normal vectors to describe the relationship between the query point and the points in its neighborhood, and it represents the features of the point neighborhood in the form of a histogram. Because FPFH is fast to compute, has only 33 dimensions, and requires little memory, this paper proposes a point cloud registration algorithm based on FPFH features. The method registers point clouds from different viewpoints to obtain a more complete point cloud model. Applied to multi-source real plant point clouds with a low overlap rate, it yields a more complete 3D point cloud model of the plant with an error at the millimeter level. This method is of great significance to agricultural intelligence. The remainder of the paper is organized as follows. Section 2 introduces the detailed method and mathematical model. Section 3 contains the experimental verification and a discussion of the results. Finally, Section 4 gives the conclusion of this paper.

2 Methodology

The purpose of registration is to bring two point clouds into the same coordinate system. The two point clouds to be registered are referred to as the source point cloud and the target point cloud, respectively. This paper proposes a point cloud registration method based on FPFH: (1) Preprocess the two point clouds, which includes subsampling each point cloud with a voxel grid and applying statistical filtering to remove outliers; (2) Estimate the normals of the points in each point cloud and compute their FPFH features. Histogram similarity is evaluated with the Bhattacharyya distance, and the FPFH features of the source and target point clouds are compared iteratively to obtain an initial set of matching point pairs; (3) Improve the accuracy of the matching point pairs. In the special case where multiple points match the same point, only the corresponding point with the closest distance is kept. The remaining matching point pairs are then sorted and the top-ranked pairs are selected. At the same time, a distance threshold is used to filter out points that are too concentrated; (4) Random sample consensus (RANSAC) is used to remove wrong matching point pairs, and singular value decomposition (SVD) is applied to the remaining pairs to obtain the rotation matrix and translation vector. The overall process is shown in Fig. 1.
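The preprocessing in step (1) can be sketched as follows. This is a minimal pure-Python illustration, not the PCL-based implementation used in the paper; the function names, the neighbor count `k`, and the ratio `std_ratio` are assumptions chosen for the example.

```python
import math
from collections import defaultdict

def voxel_downsample(points, voxel_size):
    """Replace all points that fall into the same voxel by their centroid."""
    buckets = defaultdict(list)
    for p in points:
        # Integer voxel index of the point along each axis.
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    return [
        tuple(sum(axis) / len(group) for axis in zip(*group))
        for group in buckets.values()
    ]

def remove_statistical_outliers(points, k=8, std_ratio=1.0):
    """Drop points whose mean distance to their k nearest neighbors is unusually large.

    Brute-force O(n^2) search; needs at least two points.
    """
    mean_dists = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        mean_dists.append(sum(nearest) / len(nearest))
    mu = sum(mean_dists) / len(mean_dists)
    sigma = math.sqrt(sum((d - mu) ** 2 for d in mean_dists) / len(mean_dists))
    threshold = mu + std_ratio * sigma  # assumed rule: mean + std_ratio * std
    return [p for p, d in zip(points, mean_dists) if d <= threshold]
```

A real pipeline would replace the brute-force neighbor search with a k-d tree; the quadratic loop is kept here only for readability.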

Fig. 1. The pipeline of pairwise point cloud registration.

2.1 Point Normal Estimation

Since a single point cannot reflect the geometric surface features around it, the features of a point need to be considered by combining the neighboring points within a certain range around it. Typically, geometric surface features are represented based on the normals of points within a neighborhood.

In this paper, the normal vector at a point on the point cloud surface is estimated from the points in its neighborhood. The problem of finding the normal at a point on the surface is converted into the problem of estimating the normal of the tangent plane fitted to the neighborhood. Following the least-squares idea, an objective function containing the normal vector is minimized so that the dot product between the normal vector and the vector from the point to each of its neighbors is as close to zero as possible.

First, the plane is represented as a point x and a normal vector \(\overrightarrow{n}\), as shown in Fig. 2. The points in the neighborhood are represented by \({p}_{i}\in {P}^{n}\).

Fig. 2. Estimating the normal vector of a point.

Calculate the centroid \(\overline{p }\) of the neighborhood points in the given radius \({r}_{n}\) and assign it to \(x\), which can be shown as

$$x=\overline{p }=\frac{1}{n}\sum\nolimits_{i=1}^{n}{p}_{i}.$$
(1)

The vector from the centroid to the neighborhood point is defined as

$${y}_{i}={p}_{i}-\overline{p}.$$
(2)

The distance \({d}_{i}\) from the neighboring point \({p}_{i}\) to the plane is equal to the projection of \(({p}_{i}-x)\) onto the normal vector \(\overrightarrow{n}\), which is expressed as

$${d}_{i}=\left({p}_{i}-x\right)\cdot \overrightarrow{n}.$$
(3)

The least-square plane estimation problem is constructed. To find a plane passing through the centroid \(\overline{p }\) with a normal vector \(\overrightarrow{n}\), the objective function is

$$\underset{x,\overrightarrow{n},\Vert \overrightarrow{n}\Vert =1}{\mathit{min}}\sum\nolimits_{i=1}^{n}{\left[{\left({p}_{i}-x\right)}^{T}\overrightarrow{n}\right]}^{2}.$$
(4)

Substituting the centroid \(\overline{p }\) for \(x\), the above formula simplifies to

$$\underset{\overrightarrow{n},\Vert \overrightarrow{n}\Vert =1}{\mathit{min}}{\overrightarrow{n}}^{T}\left(Y{Y}^{T}\right)\overrightarrow{n}.$$
(5)

Here \(Y{Y}^{T}=\sum\nolimits_{i=1}^{n}{y}_{i}{{y}_{i}}^{T}\) is the 3 × 3 covariance matrix, denoted by \(C\). By the definition of \({y}_{i}\), we have

$$C=Y{Y}^{T}=\sum\nolimits_{i=1}^{n}{y}_{i}{{y}_{i}}^{T}=\sum\nolimits_{i=1}^{n}\left({p}_{i}-\overline{p }\right){\left({p}_{i}-\overline{p }\right)}^{T}.$$
(6)

The relationship between the eigenvalue \({\lambda }_{e}\) and the eigenvector \(\overrightarrow{{v}_{e}}\) of the covariance matrix is

$$C\overrightarrow{{v}_{e}}={\lambda }_{e}\overrightarrow{{v}_{e}} , e\in \left\{\mathrm{0,1},2\right\}.$$
(7)

The eigenvalue \({\lambda }_{e}\) and its corresponding eigenvector \(\overrightarrow{{v}_{e}}\) are obtained by singular value decomposition. If \(0\le {\lambda }_{0}\le {\lambda }_{1}\le {\lambda }_{2}\), according to the principle of principal component analysis (PCA), the eigenvector \(\overrightarrow{{v}_{0}}\) corresponding to \({\lambda }_{0}\) is approximately in the direction of the normal vector \(\overrightarrow{n}\) or \(-\overrightarrow{n}\).

The sign ambiguity of the normal vector cannot be resolved mathematically. Therefore, we set a viewpoint \({V}_{p}\) as a reference so that the normal direction is chosen consistently [21]. All normal directions should satisfy the condition

$$\overrightarrow{n}\cdot \left({V}_{p}-{p}_{i}\right)>0.$$
(8)
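The normal estimation of Eqs. (1) to (8) can be sketched in pure Python as follows. This is an illustrative sketch under stated assumptions, not the PCL routine used in the paper: the function names are hypothetical, and the eigenvector is found with a small Jacobi rotation solver instead of the singular value decomposition mentioned in the text (for a symmetric covariance matrix both give the same eigenvectors).

```python
import math

def smallest_eigenvector(C, sweeps=50):
    """Eigenvector of the smallest eigenvalue of a symmetric 3x3 matrix (Jacobi rotations)."""
    A = [row[:] for row in C]
    V = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]
    for _ in range(sweeps):
        # Pick the largest off-diagonal entry and rotate it to zero.
        p, q = max(((0, 1), (0, 2), (1, 2)), key=lambda ij: abs(A[ij[0]][ij[1]]))
        if abs(A[p][q]) < 1e-15:
            break
        angle = 0.5 * math.atan2(2.0 * A[p][q], A[q][q] - A[p][p])
        c, s = math.cos(angle), math.sin(angle)
        for M in (A, V):      # right-multiply A and V by the rotation
            for k in range(3):
                mp, mq = M[k][p], M[k][q]
                M[k][p], M[k][q] = c * mp - s * mq, s * mp + c * mq
        for k in range(3):    # left-multiply A by the transposed rotation
            ap, aq = A[p][k], A[q][k]
            A[p][k], A[q][k] = c * ap - s * aq, s * ap + c * aq
    e = min(range(3), key=lambda i: A[i][i])  # index of lambda_0
    return [V[r][e] for r in range(3)]

def estimate_normal(query, neighbors, viewpoint):
    """Estimate the unit normal at `query` from its neighborhood, oriented toward `viewpoint`."""
    n = len(neighbors)
    centroid = [sum(p[a] for p in neighbors) / n for a in range(3)]           # Eq. (1)
    ys = [[p[a] - centroid[a] for a in range(3)] for p in neighbors]          # Eq. (2)
    C = [[sum(y[r] * y[c] for y in ys) for c in range(3)] for r in range(3)]  # Eq. (6)
    normal = smallest_eigenvector(C)                                          # Eq. (7)
    to_view = [viewpoint[a] - query[a] for a in range(3)]
    if sum(normal[a] * to_view[a] for a in range(3)) < 0:                     # Eq. (8)
        normal = [-x for x in normal]
    return normal
```

For example, neighbors sampled from the plane z = 0 with a viewpoint above the plane give a normal along the positive z axis.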

2.2 The Calculation of FPFH

Both PFH and FPFH construct multi-dimensional histograms to describe point feature information. FPFH is an optimized variant of PFH. The main differences are the way neighboring point pairs are selected, the calculation of the point features, and the dimension of the histogram. The neighborhood influence regions of PFH and FPFH are shown in Fig. 3. When calculating PFH, all of \({p}_{q}\)’s neighbors enclosed in the sphere with a given radius \({r}_{p}\) are selected, and the relationship between every pair of these points is calculated. FPFH instead first calculates the relationship between point \({p}_{q}\) and its neighborhood points within a given radius \({r}_{f}\), and then computes the relationship between each of these neighborhood points and its own neighborhood points. Therefore, the computational complexity of FPFH is lower than that of PFH [20].

Fig. 3. The influence region of PFH (left) and FPFH (right).

The calculation of FPFH first defines the space within the given radius \({r}_{f}\) as the neighborhood of the query point \({p}_{q}\). The query point \({p}_{q}\) is then paired with each point in the neighborhood. To represent the relationship between the two points of each pair in a local coordinate system, a Darboux coordinate frame is established for the point pair, as shown in Fig. 4.

Fig. 4. Darboux coordinate frame.

Specifically, Darboux coordinate frame \(u-v-w\) is defined as

$$\left\{ \begin{aligned} & u = n_s \\ & v = \frac{{p_t - p_s }}{{\left\| {p_t - p_s } \right\|}} \times u \\ & w = u \times v, \\ \end{aligned} \right.$$
(9)

where \({p}_{s}\) and \({p}_{t}\) are the coordinates of the source point and the target point, and \({n}_{s}\) and \({n}_{t}\) are their estimated normals, respectively; \({p}_{t}-{p}_{s}\) is the vector from the source point to the target point. The direction of the normal \({n}_{s}\) is defined as the direction of the \(u\) axis. The direction of the \(v\) axis is the direction of the cross product of \({p}_{t}-{p}_{s}\) and \(u\). The \(w\) axis is the cross product of the \(u\) and \(v\) axes. Therefore, \(u-v-w\) is a Cartesian coordinate system with three mutually perpendicular axes.

Four features describing the relationship between two points are defined, which reduces the number of values representing the two points and their normals from 12 to 4. They are expressed as

$$\left\{ {\begin{array}{*{20}l} {\alpha = v \cdot n_t } \hfill \\ {\phi = u \cdot \frac{{p_t - p_s }}{d}} \hfill \\ {\theta = \arctan \left( {w \cdot n_t ,u \cdot n_t } \right)} \hfill \\ {d = \left\| {p_t - p_s } \right\|,} \hfill \\ \end{array} } \right.$$
(10)

where \(\alpha\) is the dot product of the normal \({n}_{t}\) and the \(v\) axis, \(\phi\) is the dot product of the \(u\) axis and the normalized \({p}_{t}-{p}_{s}\), \(\theta\) is the angle between the \(u\) axis and the projection of the target normal \({n}_{t}\) onto the \(u-w\) plane, and \(d\) is the Euclidean distance from \({p}_{s}\) to \({p}_{t}\). The fourth feature, \(d\), carries little information in practice because the distance between adjacent points increases with the distance from the viewpoint. Therefore, \(d\) is omitted from the calculation of FPFH [20]. The three angular features {\(\alpha\), \(\phi\), \(\theta\)} adopted by FPFH are spread over as much of the available histogram range as possible without exhibiting a bias toward certain regions [22].
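Equations (9) and (10) can be written out directly for a single point pair. The sketch below is only an illustration in pure Python; the helper names are assumptions, and \(v\) is left unnormalized exactly as in Eq. (9), whereas library implementations commonly normalize it.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def pair_features(p_s, n_s, p_t, n_t):
    """Return (alpha, phi, theta, d) for a source/target pair, following Eqs. (9) and (10).

    Assumes p_s != p_t and that p_t - p_s is not parallel to n_s
    (otherwise v degenerates to the zero vector).
    """
    diff = tuple(t - s for s, t in zip(p_s, p_t))
    d = math.sqrt(dot(diff, diff))
    unit = tuple(c / d for c in diff)
    u = n_s                     # u axis: source normal
    v = cross(unit, u)          # v axis: normalized (p_t - p_s) x u
    w = cross(u, v)             # w axis: u x v
    alpha = dot(v, n_t)
    phi = dot(u, unit)
    theta = math.atan2(dot(w, n_t), dot(u, n_t))
    return alpha, phi, theta, d
```

For two points on a flat surface with identical normals, all three angular features are zero, which matches the intuition that a plane has no distinctive local geometry.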

The calculation of the FPFH of the query point \({p}_{q}\) is divided into the following steps: (1) Calculate \(\alpha\), \(\phi\), \(\theta\) between the query point \({p}_{q}\) and each of its neighbors; the resulting histogram is the simplified point feature histogram, denoted as \(SPFH({p}_{q})\); (2) Calculate \(\alpha\), \(\phi\), \(\theta\) in the same way for each of the \(K\) neighborhood points \({p}_{qi} (i=\mathrm{1,2},\cdots , K)\), denoted as \(SPFH({p}_{qi})\); (3) Each of the three angular features is divided into 11 bins, so a histogram with 33 bins is obtained. \(SPFH({p}_{q})\) and the weighted \(SPFH({p}_{qi})\) are accumulated into the histogram to obtain \(FPFH({p}_{q} )\), which is expressed as

$$FPFH\left({p}_{q}\right)=SPFH\left({p}_{q}\right)+\frac{1}{K}\sum\nolimits_{i=1}^{K}\frac{1}{{\omega }_{i}}SPFH\left({p}_{qi}\right).$$
(11)

where \(K\) is the number of neighborhood points of \({p}_{q}\), and \({\omega }_{i}\) is the Euclidean distance between \({p}_{q}\) and \({p}_{qi}\), so that the weight \(1/{\omega }_{i}\) is inversely proportional to this distance: the farther \({p}_{qi}\) is from \({p}_{q}\), the less influence \({p}_{qi}\) has on \(FPFH\left({p}_{q}\right)\).
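The binning and the weighted sum of Eq. (11) can be sketched as follows. The function names, the value ranges used for binning (\([-1,1]\) for \(\alpha\) and \(\phi\), \([-\pi ,\pi ]\) for \(\theta\)), and the normalization by the number of pairs are assumptions made for this illustration; the weights use \({\omega }_{i}\) as the distance between \({p}_{q}\) and \({p}_{qi}\).

```python
import math

def spfh(features, bins=11):
    """Build a 3 x `bins` histogram from a list of (alpha, phi, theta) triples."""
    hist = [0.0] * (3 * bins)
    ranges = ((-1.0, 1.0), (-1.0, 1.0), (-math.pi, math.pi))
    for triple in features:
        for slot, (value, (lo, hi)) in enumerate(zip(triple, ranges)):
            idx = int((value - lo) / (hi - lo) * bins)
            idx = min(max(idx, 0), bins - 1)   # clamp the upper boundary into the last bin
            hist[slot * bins + idx] += 1.0 / len(features)
    return hist

def fpfh_from_spfh(spfh_query, spfh_neighbors, distances):
    """Combine SPFH histograms as in Eq. (11); `distances` holds omega_i for each neighbor."""
    k = len(spfh_neighbors)
    fpfh = list(spfh_query)
    for hist, omega in zip(spfh_neighbors, distances):
        for b, value in enumerate(hist):
            fpfh[b] += value / (omega * k)     # (1/K) * (1/omega_i) * SPFH(p_qi)
    return fpfh
```

With 11 bins per feature the result has 33 entries, matching the 33-dimensional descriptor described in the text.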

2.3 Find the Matching Point Pairs

Sample Consensus Initial Alignment (SAC-IA) is a classical and widely used point cloud registration algorithm that calculates the transformation matrix from local descriptors [20]. The algorithm selects multiple sampling points from the source point cloud, finds one or more similar points in the target point cloud for each of them, and then randomly selects one of the similar points as the corresponding point. A transformation matrix is calculated for each set of selected point pairs, and an error metric is computed to evaluate it. Finally, the best transformation is chosen among all candidates. However, because of the way matching pairs are selected and the criterion used to evaluate similarity, registration is unstable when the overlap rate is low and the noise is high, and wrong results may occur. Therefore, the proposed method does not use SAC-IA.

In statistics, several metrics are available for evaluating the similarity of probability distributions, such as Bhattacharyya distance, Hellinger distance, Kullback-Leibler divergence, and correlation. Comparative experiments with these metrics are presented in Sect. 3. Because of its symmetry and lower computational complexity, the proposed method uses Bhattacharyya distance as the metric to evaluate FPFH similarity.

For discrete probability distributions \(p\) and \(q\) over the same domain \(X\), Bhattacharyya distance \({D}_{B}\left(p,q\right)\) is defined as

$$D_B \left( {p,q} \right) = - ln \left( {BC\left( {p,q} \right)} \right),$$
(12)

where

$$BC\left( {p,q} \right) = \sum\nolimits_{x \in X} {\sqrt {p\left( x \right)q\left( x \right)} }$$
(13)

The FPFH histogram is essentially a concatenation of three 11-dimensional histograms. Therefore, when comparing two FPFH descriptors, the three sub-histograms are compared separately. The three Bhattacharyya distances are calculated individually, and their sum is used to evaluate the similarity between the two descriptors. The FPFH similarity of the two points \({p}_{1}\) and \({p}_{2}\) is calculated as

$${D}_{B-FPFH}\left({p}_{1},{p}_{2}\right)={D}_{B-\alpha }\left({p}_{1},{p}_{2}\right)+{D}_{B-\phi }\left({p}_{1},{p}_{2}\right)+{D}_{B-\theta }\left({p}_{1},{p}_{2}\right).$$
(14)

Since the range of BC(p, q) is \([\mathrm{0,1}]\), the range of \({D}_{B}\left(p,q\right)\) is [0, +∞), and the range of \({D}_{B-FPFH}\left({p}_{1},{p}_{2}\right)\) is also [0, +∞). The closer the Bhattacharyya distance is to zero, the more similar the histograms of the two points are, which means the more similar the neighborhood features of the two points are.
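Equations (12) to (14) translate into a few lines of code. The sketch below is illustrative only: the function names are hypothetical, each sub-histogram is normalized to sum to one before comparison (an assumption, so that it can be treated as a discrete distribution), and a small `eps` guards against taking the logarithm of zero when two histograms do not overlap.

```python
import math

def bhattacharyya_distance(p, q, eps=1e-12):
    """D_B(p, q) = -ln(BC(p, q)) for two non-empty histograms of equal length."""
    sp, sq = sum(p), sum(q)
    bc = sum(math.sqrt((a / sp) * (b / sq)) for a, b in zip(p, q))  # Eq. (13)
    bc = min(max(bc, eps), 1.0)   # keep BC inside (0, 1] despite rounding
    return -math.log(bc)                                            # Eq. (12)

def fpfh_distance(f1, f2, bins=11):
    """Sum of the Bhattacharyya distances of the alpha, phi and theta sub-histograms, Eq. (14)."""
    return sum(
        bhattacharyya_distance(f1[s:s + bins], f2[s:s + bins])
        for s in range(0, 3 * bins, bins)
    )
```

Identical descriptors give a distance of zero, and the distance grows as the histograms share less mass.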

For each point in the source point cloud, the point in the target point cloud with the smallest Bhattacharyya distance is chosen as its match. These matching point pairs form the initial set of matched point pairs.

2.4 Strategies to Improve the Accuracy of Matching Pairs

In order to filter some matching point pairs and improve the estimation accuracy of the transformation matrix, some strategies are added to our algorithm, as shown in Algorithm 1.

In the set of matching point pairs, it is inevitable that multiple source points choose the same target point as their most similar point. When this happens, the selection is reversed: the target point that has been selected multiple times chooses its own most similar point among the source points that selected it. The resulting pair is kept in the set, while the other pairs involving that target point are deleted.

Then, the matching point pairs are sorted by distance, and the top-ranked pairs are selected. At the same time, a distance threshold is applied to eliminate points that are too concentrated, which widens the spatial distribution of the selected points. Finally, RANSAC is used to weed out incorrect matching pairs. The remaining set of matching point pairs is used for the subsequent calculations.
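The first two strategies can be sketched as follows. Because Algorithm 1 is not reproduced in the text, this is one possible reading rather than the authors' exact procedure: the function name, the tuple layout, the greedy order in which the spacing threshold is applied, and the parameters `k_top` and `min_spacing` are all assumptions. The RANSAC step is omitted.

```python
import math

def filter_matches(matches, source_points, k_top, min_spacing):
    """Filter initial matches given as (source_index, target_index, feature_distance).

    1. If several source points share a target, keep only the closest one.
    2. Rank the survivors by feature distance.
    3. Greedily keep up to k_top pairs whose source points are at least
       min_spacing apart, so the kept pairs are not too concentrated.
    """
    best_per_target = {}
    for s, t, dist in matches:
        if t not in best_per_target or dist < best_per_target[t][2]:
            best_per_target[t] = (s, t, dist)

    ranked = sorted(best_per_target.values(), key=lambda m: m[2])

    kept = []
    for s, t, dist in ranked:
        far_enough = all(
            math.dist(source_points[s], source_points[k[0]]) >= min_spacing
            for k in kept
        )
        if far_enough:
            kept.append((s, t, dist))
        if len(kept) == k_top:
            break
    return kept
```

Spreading the retained pairs over the point cloud matters because a rigid transformation estimated from tightly clustered points is poorly constrained in rotation.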

Algorithm 1.

2.5 Calculation of Transformation

According to the matching point pairs, the singular value decomposition method is used to calculate the transformation between the two point clouds. The matching point pair set after filtering is expressed as

$$P=\left\{{p}_{1},\dots ,{p}_{m}\right\},{P}^{^{\prime}}=\left\{{p}_{1}^{^{\prime}},\dots ,{p}_{m}^{^{\prime}}\right\}.$$
(15)

All point pairs conform to

$${p}_{k}=R{p}_{k}^{^{\prime}}+t.$$
(16)

where \(R\) is the rotation matrix and \(t\) is the translation vector. The error \({e}_{k}\) is constructed as

$${e}_{k}={p}_{k}-\left(R{p}_{k}^{^{\prime}}+t\right).$$
(17)

To obtain \(R\), \(t\), it is necessary to minimize the sum of error squares, which can be expressed as

$$E^2 = \mathop \sum\nolimits_{k = 1}^m \left\| {\left( {p_k - \left( {Rp_k^{\prime} + t} \right)} \right) } \right\|^2.$$
(18)

According to the least-squares fitting of the two point sets \(P\) and \({P}^{\mathrm{^{\prime}}}\), Eq. (18) can be simplified as

$$E^2 = \sum\nolimits_{k = 1}^m \left\| {q_k - Rq_k^{\prime} } \right\|^2 ,$$
(19)

$$t=p-R{p}^{^{\prime}},$$
(20)

where \(p\) and \({p}^{\mathrm{^{\prime}}}\) are the centroids of the two point sets \(P\) and \({P}^{\mathrm{^{\prime}}}\), respectively, and \({q}_{k}\) and \({q}_{k}^{^{\prime}}\) are the centered coordinates

$${q}_{k}={p}_{k}-p,{q}_{k}^{^{\prime}}={p}_{k}^{^{\prime}}-{p}^{^{\prime}}.$$
(21)

Singular value decomposition is used to estimate \(R\). Finally, \(t\) is solved by substituting \(R\) into Eq. (20).
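For reference, the SVD step can be written out as follows. This is the standard closed-form solution for least-squares fitting of two centered point sets, given here as a sketch; the symbols \(H\), \(U\), \(\Lambda\), and \(V\) are introduced only for this illustration and do not appear in the original text. Minimizing the sum of \({\Vert {q}_{k}-R{q}_{k}^{\prime}\Vert }^{2}\) over rotations is equivalent to maximizing \(\mathrm{tr}(RH)\) with

$$H=\sum\nolimits_{k=1}^{m}{q}_{k}^{\prime}{{q}_{k}}^{T},\quad H=U\Lambda {V}^{T},$$

and the maximizer is

$$R=V\,\mathrm{diag}\left(1,1,\mathrm{det}\left(V{U}^{T}\right)\right){U}^{T}.$$

The factor \(\mathrm{det}(V{U}^{T})\) equals 1 in the usual case and \(-1\) when \(V{U}^{T}\) would be a reflection, in which case it flips the sign of the last column of \(V\) so that \(R\) remains a proper rotation.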

3 Experiment and Analysis

The experimental data include public point cloud data and plant point cloud data captured in real scenes. The experiments were run on a computer with 8 GB RAM and an Intel Core i5-10500 CPU. The program was written in C++ using the open-source Point Cloud Library (PCL), which provides many basic point cloud processing functions.

3.1 Registration Experiment of Public Dataset

In order to verify the accuracy and validity of the method using Bhattacharyya distance, it was compared with methods using Hellinger distance, Kullback-Leibler divergence, and correlation as the similarity criterion. The registration results of the public point cloud models from different perspectives are shown in Figs. 5, 6 and 7. The initial input point cloud models are selected from the Stanford 3D Scanning Repository.

It can be seen that all of these methods can roughly register point clouds from different perspectives. In comparison, the proposed method aligns the details better, without obvious double shadows or offsets. In the registration of the Bunny 0° and 45° models, all methods except the proposed one align the ear region poorly. In the registration of the Bunny 45° and 90° models, the result of the correlation-based method shows an overall deviation. In the registration of the Armadillo 270° and 300° models, the results of the methods based on Hellinger distance, Kullback-Leibler divergence, and correlation do not fit well at the claws and head of the model.

The registration error of each method is then quantitatively evaluated by the root mean square error (RMSE) criterion, as shown in Eq. (22).

$$RMSE = \sqrt {\frac{1}{M}\sum\nolimits_{l = 1}^M {\left\| {Rp_l + t - q_l } \right\|} ^2 }$$
(22)

where \(R\) and \(t\) are the true values of the rotation matrix and translation vector respectively, \({p}_{l}\) and \({q}_{l}\) are the points of the initial point cloud and the point cloud transformed by the registration algorithm respectively, and \(M\) is the number of points in the point cloud.
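Equation (22) can be evaluated with a few lines of code. The sketch below is illustrative; the function names are assumptions, and the two point lists are assumed to be in one-to-one correspondence.

```python
import math

def apply_transform(R, t, p):
    """Return R p + t for a 3x3 rotation matrix R (list of rows) and translation t."""
    return tuple(sum(R[r][c] * p[c] for c in range(3)) + t[r] for r in range(3))

def rmse(R_true, t_true, initial_points, registered_points):
    """Root mean square error of Eq. (22).

    initial_points:    points p_l of the initial point cloud
    registered_points: points q_l produced by the registration algorithm
    """
    total = 0.0
    for p, q in zip(initial_points, registered_points):
        total += math.dist(apply_transform(R_true, t_true, p), q) ** 2
    return math.sqrt(total / len(initial_points))
```

A perfect registration reproduces the ground-truth transformation and therefore gives an RMSE of zero.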

Fig. 5. Initial input of Bunny 0° and 45° and registration results using methods based on different metrics. (a) Bunny 0° and 45°; (b) Hellinger distance; (c) Kullback-Leibler divergence; (d) Correlation; (e) The proposed method.

Fig. 6. Initial input of Bunny 45° and 90° and registration results using methods based on different metrics. (a) Bunny 45° and 90°; (b) Hellinger distance; (c) Kullback-Leibler divergence; (d) Correlation; (e) The proposed method.

Fig. 7. Initial input of Armadillo 270° and 300° and registration results using methods based on different metrics. (a) Armadillo 270° and 300°; (b) Hellinger distance; (c) Kullback-Leibler divergence; (d) Correlation; (e) The proposed method.

The RMSEs of each experiment are listed in Table 1. The RMSE of the proposed method is lower than that of the other methods. The results are consistent with those in Figs. 5, 6 and 7, which indicates that the proposed method is more accurate and effective when applied to the classic data sets.

Table 1. Quantitative evaluation of classical data set registration experiments.

3.2 Registration Experiment of Noisy Plant Point Cloud

In order to verify the accuracy and noise resistance of the proposed method in registering plant point clouds with low overlap, it was compared with two typical point cloud registration algorithms, SAC-IA and the normal distributions transform (NDT). The experimental data are plant point clouds reconstructed with COLMAP, an open-source 3D reconstruction pipeline, from images captured by an RGB camera. Gaussian noise with zero mean and variance of \(3mr\) was added to the point cloud model to bring the experiment closer to a real application scenario. The resolution of the point cloud, denoted as \(mr\), is the mean distance from each point to its nearest neighbor and is defined as

$$mr = \frac{1}{n}\sum\nolimits_{i = 1}^n {\left\| {p_i - p_{in} } \right\|} ,$$
(23)

where \({p}_{in}\) is the nearest point of \({p}_{i}\), and \(n\) is the number of points.
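The resolution \(mr\) of Eq. (23) can be computed as in the following sketch. The function name is an assumption, and the brute-force nearest-neighbor search is used only for clarity; a k-d tree would be the practical choice for large point clouds.

```python
import math

def point_cloud_resolution(points):
    """Mean distance from each point to its nearest neighbor, Eq. (23)."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)
```

For three collinear points at x = 0, 1, and 3, the nearest-neighbor distances are 1, 1, and 2, so the resolution is 4/3.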

The inputs of the experiment and the registration results of the different methods are shown in Fig. 8. The inputs are point clouds with Gaussian noise, and their overlap rate is about 8%. The comparison shows that SAC-IA failed to register the two point clouds and mistakenly spliced the point cloud of the trunk onto that of the crown. Although NDT roughly identifies the transformation between the two point clouds, its result still shows a double shadow in the branches. The proposed method registers the two point clouds accurately, and its noise resistance and accuracy are better than those of the other two methods.

The RMSEs obtained with the different methods are shown in Table 2. The RMSE of the proposed method is 0.0168, which is smaller than those of SAC-IA (1.5362) and NDT (0.0558). This indicates that the proposed method achieves satisfactory registration of plant point clouds with low overlap while effectively suppressing the influence of noise.

Fig. 8. The input of the experiment and the registration results of plant point clouds with different methods. (a) Input; (b) SAC-IA; (c) NDT; (d) The proposed method.

Table 2. Quantitative results of plant point cloud registration experiment with Gaussian noise.

3.3 Registration Experiment of Multi-source Plant Point Cloud

Multi-source point clouds in real scenes contain more noise, and their densities differ from one another, which places higher demands on the discriminative power and robustness of registration methods. The experiment in this section verifies the effectiveness of the proposed method in registering plant point clouds captured by different sensors in real application scenes. The experimental data were captured by the RGB camera of an unmanned aerial vehicle (UAV) and by a mobile phone, respectively. The two inputs, the point cloud captured on the ground by the mobile phone and the one captured by the UAV, are shown in Fig. 9. The overlap between the two point clouds is about 25%.

Fig. 9. The input of the experiment and the registration results of plant point clouds with different methods. (a) Input; (b) SAC-IA; (c) NDT; (d) The proposed method.

The proposed method was compared with SAC-IA and NDT, and the results are also shown in Fig. 9. The registration result of SAC-IA is shown in Fig. 9b. The relative position of the two point clouds is roughly correct, but the rotation angle deviates. As shown in Fig. 9c, NDT gives a wrong transformation when registering the multi-source plant point clouds. The alignment result of the proposed method is more accurate, as shown in Fig. 9d. The branches in the point clouds are fitted without obvious position or angle deviation. Manual measurement shows that the largest offset in the result of the proposed method is less than 1 cm, which is within the acceptable range.

According to the experimental results, in real application scenes, the proposed method can effectively register multi-source plant point clouds with a low overlap rate, and the error is within millimeter level. By splicing the point clouds captured by both terrestrial and airborne methods, more complete tree point clouds can be obtained, which lays a foundation for extracting more plant phenotypes.

4 Conclusion

Aiming at the difficulty of obtaining the complete point cloud model of tall plants, a method of point cloud registration in a real scene is proposed. The FPFH features are calculated according to the normal vector of the point cloud, and the similarity of features is evaluated by Bhattacharyya distance to obtain the initial matching pairs. Then, a filtering algorithm based on RANSAC is used to obtain the matching pairs with high accuracy. Finally, the transformation of the point cloud is obtained by singular value decomposition. The experimental results show that the proposed method is more accurate and helps to improve the success rate of point cloud registration for tall plants.

However, the parameters of the proposed method must be tuned for each type of target object. Making the parameters adaptive will therefore be the focus of our future research.