
1 Introduction

In recent years, image recognition technology has developed rapidly on the basis of deep learning. Beyond 2D image recognition, interest in 3D vision has been growing, because learning 3D tasks directly from acquired 2D images has inherent limitations. With the rapid development of 3D acquisition technology, 3D sensors such as 3D scanners, LiDAR, and RGB-D cameras are becoming increasingly available and valuable. The advent of large-scale, high-resolution 3D datasets with scale information has also created the conditions for reasoning about 3D data with deep neural networks. As a common 3D data format, the point cloud retains the original geometric information in three-dimensional space, and is therefore regarded as the preferred data form for scene understanding. Efficient semantic segmentation of scene-level 3D point clouds has important applications in areas such as autonomous driving and robotics. In recent years, the pioneering PointNet [1] has become the most popular method for directly processing 3D point clouds. The PointNet architecture processes point clouds through a shared MultiLayer Perceptron (MLP), learning the features of each point independently and obtaining global features with max pooling. However, because PointNet learns the features of each point individually, it ignores the local structure between points. To improve on this, the same team introduced PointNet++ [2], whose core is a hierarchy of set abstraction layers, each consisting of a sampling layer, a grouping layer, and a PointNet-based learning layer. By stacking several set abstraction layers, the network learns features from local geometric structures and abstracts local features layer by layer. PointWeb [3] connects all point pairs in a local neighborhood using the local context to form a locally fully connected network, and then adjusts point features by learning point-to-point relations. This strategy enriches the point features of the local region and forms aggregated features that better describe the local region for 3D recognition.

These methods perform well on small-scale point clouds, but they face several limitations when processing scene-level point clouds: 1) the sampling methods they rely on are computationally heavy and memory-inefficient; 2) most existing local feature learners depend on computationally expensive kernelization or graph construction and therefore cannot handle large numbers of points; 3) existing learners have limited receptive field sizes and cannot capture enough local features of the complex structures in large-scale point clouds. RandLA-Net [4] provides a solution to these problems: built on simple random sampling and an effective local feature aggregator, it increases the sampling rate while gradually enlarging the receptive field of each neural layer through its feature aggregation module, helping the network learn complex local structures effectively. However, our study found that RandLA-Net [4] does not pay attention to the relationships among the points within a neighborhood when learning the local structure.

Our main contributions are as follows:

  • As the input point cloud is orientation-sensitive, we propose a new local spatial representation that is rotation-invariant about the X, Y, and Z axes, which effectively improves the performance of point cloud segmentation.

  • We propose a new local feature aggregation block, the Local Representation of Rotation Invariance (LRRI), which concatenates the spatial representation that is rotation-invariant about the X, Y, and Z axes with the relative point position representation to form a new local geometric representation that effectively preserves local geometric details.

  • We perform experimental validation on the representative S3DIS dataset; compared with state-of-the-art methods, our method achieves good performance.

2 Related Work

The goal of point cloud semantic segmentation is, given a point cloud, to divide it into subsets according to the semantics of the points. There are three paradigms for semantic segmentation: projection-based, discretization-based, and point-based.

2.1 Projection-Based Methods

To leverage 2D segmentation methods, many existing works project 3D point clouds onto 2D images and then perform 2D semantic segmentation, so that conventional 2D convolutions can be applied to point cloud data for object detection and semantic segmentation tasks. There are two main categories of such methods: (1) multi-view representations [5,6,7,8] and (2) spherical representations [9,10,11,12]. In general, the performance of multi-view segmentation methods is sensitive to viewpoint selection and occlusion. Moreover, these methods do not fully utilize the underlying geometric and structural information, because the projection step inevitably introduces information loss. The spherical projection retains more information than single-view projections and is suitable for labeling LiDAR point clouds, but this intermediate representation also inevitably introduces problems such as discretization errors and occlusion.

2.2 Discretization-Based Methods

Discretization-based methods voxelize point clouds into 3D grids and then apply powerful 3D CNNs [13,14,15,16,17]. However, the performance of these methods is sensitive to the granularity of the voxels, and voxelization itself introduces discretization artifacts. Another major limitation is the large computational cost when dealing with large-scale point clouds, so choosing a suitable grid resolution is critical for these methods in practical applications.

2.3 Point-Based Methods

Different from the first two paradigms, point-based networks act directly on irregular point clouds. However, point clouds are unordered and unstructured, so standard CNNs cannot be applied directly. For this reason, [1] proposes the pioneering network PointNet, which analyzes the irregular format and permutation invariance of point sets and processes point clouds directly, using a shared MLP as the basic unit of the network. However, the point-wise features extracted by the shared MLP cannot capture the local geometric structure and the interactions between points. PointNet++ [2] therefore extends PointNet [1] with farthest point sampling and grouping layers so that local details are considered in addition to global information. Although PointNet++ [2] makes use of local context, aggregating it with only max pooling may not summarize local regions well. To better capture contextual features and geometric structure, some works use graph networks [18,19,20] or Recurrent Neural Networks (RNNs) [21,22,23] for segmentation. The article [4] proposes RandLA-Net, an efficient lightweight network for large-scale point cloud segmentation, which exploits random sampling to achieve very high memory and computational efficiency and introduces a local feature aggregation module to capture and preserve geometric features.

3 EEP-Net

In this section, we describe the EEP module for large-scale point cloud segmentation, which mainly consists of two blocks: Local Representation of Rotation Invariance (LRRI) and Attentive Pooling (AP). We then introduce EEP-Net, an encoder-decoder network built from EEP modules.

Fig. 1. Architecture of the EEP module.

3.1 Architecture of EEP Module

The architecture of the EEP module is shown in Fig. 1. Given a point cloud P and the features of each point (including spatial information and intermediate learned features), the local features of each point are learned efficiently with the two blocks LRRI and AP; Fig. 1 shows how the local features of one point are learned, and the same operation is applied to every point in parallel. The local spatial representation constructed by LRRI, which is rotation-invariant about the X, Y, and Z axes, is aggregated by AP, and we perform the LRRI/AP operation twice for the same point to gather information from roughly K² neighboring points, which significantly enlarges the receptive field of each point. The final output of the module is a local feature that is rotation-invariant about the X, Y, and Z axes.

Fig. 2. Pictures of the same chair from different angles.

Local Representation of Rotation Invariance (LRRI). As a geometric object, the learned representation of a point set should be invariant to rotation transformations: rotating all points together should change neither the global point cloud category nor the segmentation of the points. In many real scenes, such as the chairs commonly found in indoor scenes shown in Fig. 2, objects belonging to the same category usually appear in different orientations. Moreover, the same object is not only subject to rotation about the Z-axis (Figs. 2(d)(e)); rotations about the X-axis and Y-axis also occur. To address this issue, we propose to learn a rotation-invariant local representation that uses polar coordinates to represent individual points locally; the overall structure of LRRI is shown in Fig. 1. Local spatial information is input to the LRRI block, and the output is a local representation with rotation-invariant features about the X, Y, and Z axes. LRRI includes the following steps. Finding neighboring points: for each point \(P_i\), neighboring points are collected with the K-Nearest Neighbors (KNN) algorithm based on point-wise Euclidean distance, which improves the efficiency of local feature extraction. Representation of local geometric features in two coordinate systems:

(a) Local geometric representation based on polar coordinates: for the K nearest points \(P_1, P_2, P_3, \dots , P_k\) of the center point \(P_i\), we convert the Cartesian (XYZ) coordinates of each point to a polar representation and then subtract the polar representation of the center point from that of each neighboring point to obtain the local geometric representation based on polar coordinates. The specific operations are as follows:

1) Local representation invariant to Z-axis rotation:

$$\begin{aligned} \alpha _{i}^{k}=\arctan \left( \frac{y_{i}^{k}}{x_{i}^{k}}\right) -\arctan \left( \frac{y_{i m}}{x_{i m}}\right) , \end{aligned}$$
(1)
$$\begin{aligned} \beta _{i}^{k}=\arctan \left( \frac{z_{i}^{k}}{\sqrt{x_{i}^{k^{2}}+y_{i}^{k^{2}}}}\right) -\arctan \left( \frac{z_{i m}}{\sqrt{x_{i m}^{2}+y_{i m}^{2}}}\right) . \end{aligned}$$
(2)

2) Local representation invariant to X-axis rotation:

$$\begin{aligned} \gamma _{i}^{k}=\arctan \left( \frac{z_{i}^{k}}{y_{i}^{k}}\right) -\arctan \left( \frac{z_{i m}}{y_{i m}}\right) , \end{aligned}$$
(3)
$$\begin{aligned} \boldsymbol{\delta }_{i}^{k}=\arctan \left( \frac{x_{i}^{k}}{\sqrt{z_{i}^{k^{2}}+y_{i}^{k^{2}}}}\right) -\arctan \left( \frac{x_{i m}}{\sqrt{z_{i m}^{2}+y_{i m}^{2}}}\right) . \end{aligned}$$
(4)

3) Local representation invariant to Y-axis rotation:

$$\begin{aligned} \epsilon _{i}^{k}=\arctan \left( \frac{x_{i}^{k}}{z_{i}^{k}}\right) -\arctan \left( \frac{x_{i m}}{z_{i m}}\right) , \end{aligned}$$
(5)
$$\begin{aligned} \zeta _{i}^{k}=\arctan \left( \frac{y_{i}^{k}}{\sqrt{x_{i}^{k^{2}}+z_{i}^{k^{2}}}}\right) -\arctan \left( \frac{y_{i m}}{\sqrt{x_{i m}^{2}+z_{i m}^{2}}}\right) . \end{aligned}$$
(6)
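For concreteness, the following NumPy sketch computes the six angular offsets of Eqs. (1)–(6) for every point and its K nearest neighbors. It illustrates the formulas rather than reproducing the authors' implementation: the helper names (`knn_indices`, `angular_representation`) are ours, and we use `arctan2` in place of `arctan` of coordinate ratios to avoid division by zero, which the equations above leave implicit.

```python
import numpy as np

def knn_indices(points, k):
    """Brute-force K nearest neighbors by Euclidean distance.
    points: (N, 3) array. Returns (N, k) neighbor indices (the query point itself is included)."""
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)  # (N, N) squared distances
    return np.argsort(d2, axis=-1)[:, :k]

def angular_representation(points, neighbor_idx):
    """Angular offsets of Eqs. (1)-(6) for every point and each of its K neighbors.
    points: (N, 3); neighbor_idx: (N, K).
    Returns (N, K, 6) with [alpha, beta, gamma, delta, epsilon, zeta]."""
    center = points[:, None, :]            # (N, 1, 3): center point P_i
    neigh = points[neighbor_idx]           # (N, K, 3): neighbors P_i^k
    xc, yc, zc = center[..., 0], center[..., 1], center[..., 2]
    xk, yk, zk = neigh[..., 0], neigh[..., 1], neigh[..., 2]

    # Z-axis rotation invariance, Eqs. (1)-(2)
    alpha = np.arctan2(yk, xk) - np.arctan2(yc, xc)
    beta = np.arctan2(zk, np.sqrt(xk**2 + yk**2)) - np.arctan2(zc, np.sqrt(xc**2 + yc**2))
    # X-axis rotation invariance, Eqs. (3)-(4)
    gamma = np.arctan2(zk, yk) - np.arctan2(zc, yc)
    delta = np.arctan2(xk, np.sqrt(zk**2 + yk**2)) - np.arctan2(xc, np.sqrt(zc**2 + yc**2))
    # Y-axis rotation invariance, Eqs. (5)-(6)
    epsilon = np.arctan2(xk, zk) - np.arctan2(xc, zc)
    zeta = np.arctan2(yk, np.sqrt(xk**2 + zk**2)) - np.arctan2(yc, np.sqrt(xc**2 + zc**2))
    return np.stack([alpha, beta, gamma, delta, epsilon, zeta], axis=-1)
```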

(b) Relative point position encoding: for each of the K nearest points of the center point \(P_i\), we encode the relative point positions as follows:

$$\begin{aligned} r_{i}^{k}=MLP\left( P_{i} \oplus P_{i}^{k} \oplus \left( P_{i}-P_{i}^{k}\right) \oplus \left\| P_{i}-P_{i}^{k}\right\| \right) . \end{aligned}$$
(7)

(c) Point feature augmentation: the enhanced local geometric representation of a point is obtained by concatenating the relative position codes of its neighboring points with the local geometric features represented in the two coordinate systems above:

$$\begin{aligned} \widehat{f_{i}^{k}}=MLP\left( \alpha _{i}^{k} \oplus \beta _{i}^{k} \oplus \gamma _{i}^{k} \oplus \delta _{i}^{k} \oplus \epsilon _{i}^{k} \oplus \zeta _{i}^{k} \oplus r_{i}^{k}\right) . \end{aligned}$$
(8)
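A compact sketch of Eqs. (7)–(8), continuing the NumPy helpers above. Here `mlp_r` and `mlp_f` are placeholder callables standing in for the shared MLPs; their output widths are not specified by the paper and are left to the caller.

```python
def relative_position_encoding(points, neighbor_idx, mlp_r):
    """Relative point position encoding of Eq. (7)."""
    K = neighbor_idx.shape[1]
    center = np.repeat(points[:, None, :], K, axis=1)      # (N, K, 3): P_i repeated per neighbor
    neigh = points[neighbor_idx]                            # (N, K, 3): P_i^k
    diff = center - neigh                                   # P_i - P_i^k
    dist = np.linalg.norm(diff, axis=-1, keepdims=True)     # ||P_i - P_i^k||, (N, K, 1)
    return mlp_r(np.concatenate([center, neigh, diff, dist], axis=-1))

def augmented_local_feature(angles, rel_pos, mlp_f):
    """Point feature augmentation of Eq. (8): concatenate the angular offsets
    with the relative position code and pass the result through a shared MLP."""
    return mlp_f(np.concatenate([angles, rel_pos], axis=-1))
```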

Attentive Pooling (AP). The previous section gave the local geometric feature representation of the point cloud. Most existing work aggregates neighboring features with max/mean pooling, but this approach loses most of the information. Inspired by the SCF-Net [29] network, our attentive pooling consists of the following steps:

(a) Calculate the distances: the point features, the local geometric features generated by the LRRI block, and the geometric distances to the neighboring points are input to the AP module to learn the contextual features of the local region. We express the correlation between points through distance: the closer the distance, the stronger the correlation. Two distances are calculated, the geometric distance between points and the feature distance between point features:

$$\begin{aligned} d_{i f}^{k}={\text {mean}}(|v(i)-v(k)|), \end{aligned}$$
(9)
$$\begin{aligned} \textbf{d}_{\textbf{i}}^{\textbf{k}}=\exp \left( -d_{i g}^{k}\right) \oplus \lambda \exp \left( -d_{i f}^{k}\right) . \end{aligned}$$
(10)

(b) Calculate the attention score: use a shared MLP to learn the attention score of each feature:

$$\begin{aligned} \textbf{A}_{\textbf{i}}^{\textbf{k}}={\text {softmax}}\left( MLP\left( \textbf{d}_{\textbf{i}}^{\textbf{k}} \oplus \textbf{f}_{\textbf{i}}^{\textbf{k}}\right) \right) . \end{aligned}$$
(11)

(c) Weighted sum: use the learned attention scores to calculate the weighted sum of neighboring point features to learn important local features:

$$\begin{aligned} F_{i}^{k}=MLP\left( f_{i}^{k} \oplus \hat{f}_{i}^{k}\right) , \end{aligned}$$
(12)
$$\begin{aligned} \hat{F}_{i}=\sum _{k=1}^{K}\left( F_{i}^{k} \cdot A_{i}^{k}\right) . \end{aligned}$$
(13)
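The sketch below strings Eqs. (9)–(13) together. It is a minimal reading of the formulas, with several assumptions stated here: the point features v(i) and v(k) in Eq. (9) are taken to be the center and neighbor features, the softmax of Eq. (11) is normalized over the K neighbors, λ is passed in as `lam`, and `mlp_a`/`mlp_v` are stand-ins for the shared MLPs (they must produce outputs of the same width so the element-wise product in Eq. (13) is defined).

```python
def attentive_pooling(center_feat, neigh_feats, aug_feats, geo_dist, mlp_a, mlp_v, lam=1.0):
    """Attentive pooling of Eqs. (9)-(13).
    center_feat: (N, d)      feature of the center point P_i
    neigh_feats: (N, K, d)   features f_i^k of the K neighbors
    aug_feats:   (N, K, d')  augmented local features f_hat_i^k from the LRRI block
    geo_dist:    (N, K)      Euclidean distance between P_i and each neighbor"""
    # Eq. (9): feature distance as the mean absolute difference to the center feature
    feat_dist = np.mean(np.abs(center_feat[:, None, :] - neigh_feats), axis=-1)     # (N, K)
    # Eq. (10): distance encoding combining geometric and feature distances
    d = np.stack([np.exp(-geo_dist), lam * np.exp(-feat_dist)], axis=-1)            # (N, K, 2)
    # Eq. (11): attention scores, normalized over the K neighbors
    s = mlp_a(np.concatenate([d, neigh_feats], axis=-1))                            # (N, K, c)
    scores = np.exp(s) / np.sum(np.exp(s), axis=1, keepdims=True)                   # softmax over K
    # Eq. (12): fuse the neighbor features with the augmented local features
    fused = mlp_v(np.concatenate([neigh_feats, aug_feats], axis=-1))                # (N, K, c)
    # Eq. (13): weighted sum over the neighborhood
    return np.sum(fused * scores, axis=1)                                           # (N, c)
```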

To summarize: given the input point cloud, for the i-th point \(P_i\) we aggregate the local features of its K nearest points through the two blocks LRRI and AP and generate an informative feature vector.
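Putting the pieces together, the sketch below composes the helpers above into one LRRI/AP pass and stacks two passes, which is how we read the "twice for the same point" description in Sect. 3.1. The `mlps` dictionary packaging and the reuse of the same K in both passes are our assumptions, not details given by the paper.

```python
def lrri_ap_pass(points, feats, k, mlps):
    """One LRRI block followed by attentive pooling, using the sketches above."""
    idx = knn_indices(points, k)                                      # neighbors of every point
    angles = angular_representation(points, idx)                      # Eqs. (1)-(6)
    rel = relative_position_encoding(points, idx, mlps["r"])          # Eq. (7)
    aug = augmented_local_feature(angles, rel, mlps["f"])             # Eq. (8)
    geo_dist = np.linalg.norm(points[idx] - points[:, None, :], axis=-1)
    return attentive_pooling(feats, feats[idx], aug, geo_dist,
                             mlps["a"], mlps["v"])                    # Eqs. (9)-(13)

def eep_module(points, feats, k, mlps_1, mlps_2):
    """Two stacked LRRI/AP passes, enlarging the receptive field to roughly K^2 points."""
    return lrri_ap_pass(points, lrri_ap_pass(points, feats, k, mlps_1), k, mlps_2)
```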

3.2 Global Feature (GF)

To improve the reliability of segmentation, in addition to learning locally relevant features we borrow the GF module from SCF-Net to complement them with global features, using the relationship between location and volume ratio.

$$\begin{aligned} B_{i}=\frac{V_{i}}{V_{g}} \end{aligned}$$
(14)

where \(V_i\) is the volume of the bounding sphere of the neighborhood corresponding to \(P_i\), and \(V_g\) is the volume of the bounding sphere of the whole point cloud.

$$\begin{aligned} \textbf{f}_{\textbf{i} \textbf{G}}=MLP\left( \left( x_{i}, y_{i}, z_{i}\right) \oplus B_{i}\right) \end{aligned}$$
(15)

The x-y-z coordinates of \(P_i\) represent the location of the local neighborhood, and the resulting global contextual feature is denoted \(f_{iG}\).
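A sketch of the GF block as we understand Eqs. (14)–(15). The paper does not state how the bounding-sphere radii are obtained, so taking the largest neighbor distance (locally) and the largest distance to the centroid (globally) as the radii is our assumption; `mlp_g` again stands in for a shared MLP.

```python
def global_feature(points, neighbor_idx, mlp_g):
    """Global contextual feature f_iG of Eqs. (14)-(15), following the GF block of SCF-Net."""
    # V_i: volume of the bounding sphere of each point's neighborhood (radius assumed = farthest neighbor)
    r_local = np.max(np.linalg.norm(points[neighbor_idx] - points[:, None, :], axis=-1), axis=-1)
    v_local = (4.0 / 3.0) * np.pi * r_local ** 3                       # (N,)
    # V_g: volume of the bounding sphere of the whole point cloud (radius assumed from the centroid)
    r_global = np.max(np.linalg.norm(points - points.mean(axis=0, keepdims=True), axis=-1))
    v_global = (4.0 / 3.0) * np.pi * r_global ** 3
    ratio = (v_local / v_global)[:, None]                              # Eq. (14): B_i = V_i / V_g
    return mlp_g(np.concatenate([points, ratio], axis=-1))             # Eq. (15): MLP((x, y, z) + B_i)
```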

3.3 Architecture of EEP-Net

In this section, we embed the proposed EEP module into the widely used encoder-decoder architecture, resulting in a new network that we name EEP-Net, shown in Fig. 3. The input of the network is a point cloud of size \(N \times d\), where N is the number of points and d is the input feature dimension. The point cloud is first fed to a shared MLP layer to extract per-point features, with the feature dimension uniformly set to 8. Five encoder-decoder layers are used to learn the features of each point, and finally three consecutive fully connected layers and a dropout layer are used to predict the semantic label of each point.
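To make the overall data flow explicit, here is a schematic forward pass in the same NumPy style, reusing the `eep_module` sketch from Sect. 3.1. It is our reading of Fig. 3, not the released implementation: the 4x random decimation per encoder layer, the nearest-neighbor upsampling in the decoder, and the plain lists of MLP stand-ins are all assumptions.

```python
def eep_net_forward(points, feats, enc_mlps, dec_mlps, fc_mlps, k=16):
    """Schematic EEP-Net forward pass: 5 encoder layers (EEP module + random subsampling),
    5 decoder layers (nearest-neighbor upsampling + skip connection), then FC layers."""
    skips, pts = [], points
    for mlps_1, mlps_2 in enc_mlps:                                     # encoder
        feats = eep_module(pts, feats, k, mlps_1, mlps_2)
        skips.append((pts, feats))
        keep = np.random.choice(len(pts), max(len(pts) // 4, k), replace=False)  # random sampling
        pts, feats = pts[keep], feats[keep]
    for (skip_pts, skip_feats), mlp in zip(reversed(skips), dec_mlps):  # decoder
        nn = np.argmin(np.linalg.norm(skip_pts[:, None, :] - pts[None, :, :], axis=-1), axis=1)
        feats = mlp(np.concatenate([feats[nn], skip_feats], axis=-1))   # upsample + skip connection
        pts = skip_pts
    for fc in fc_mlps:                                                  # 3 FC layers (dropout omitted)
        feats = fc(feats)
    return feats                                                        # per-point class logits
```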

Fig. 3. Architecture of the EEP-Net and the EEP module.

4 Experiments

In this section, we evaluate our EEP-Net on S3DIS, a typical indoor scene point cloud benchmark. S3DIS is a large-scale indoor point cloud dataset consisting of point clouds from 6 areas comprising 271 rooms. Each point cloud covers a medium-sized room, and each point is annotated with one of 13 semantic labels. Our experiments are performed with TensorFlow (2.1.0). We also report the corresponding results of 8 methods on S3DIS. In addition, to verify the effectiveness of each module, we conduct ablation experiments on Area 5 of S3DIS.

4.1 Evaluation on S3DIS Dataset

We performed six-fold cross-validation to evaluate our method, using mIoU as the criterion. The quantitative results of all reference methods are shown in Table 1. Our method outperforms all other methods on the mIoU metric and achieves the best performance on 2 categories, clutter and sofa, while remaining close to the best performance on the other categories. Figure 4 shows visualization results of typical indoor scenes, including an office and a conference room. In general, the semantic segmentation of indoor scenes is difficult; for example, a whiteboard on a white wall is easily confused with the wall itself, yet our network can still identify it fairly accurately.

Table 1. Quantitative results of different methods on S3DIS and the classwise metric is IoU (%).
Fig. 4. Visualization examples of three typical indoor scenes on S3DIS: (a) RGB colored input point clouds, (b) predictions obtained via the proposed EEP-Net, (c) ground truths.

4.2 Ablation Study

The experimental results on the S3DIS dataset validate the effectiveness of our proposed method. To better understand the network, we conduct the following ablation experiments, all performed on Area 5 of the S3DIS dataset. As shown in Fig. 5, the segmentation result in (d) is clearly closer to the ground truth than that in (c), which demonstrates that EEP-Net performs better than the network with only Z-axis rotation invariance; see also Table 2.

Table 2. The mean IoU scores of all ablated networks based on our full EEP-Net.
Fig. 5. Visualization example on S3DIS Area 5: (a) RGB colored input point cloud, (b) ground truth, (c) segmentation result of the network with only Z-axis rotation invariance, (d) EEP-Net segmentation result. Circles of the same color compare the segmentation results of the same region.

5 Conclusion

In this paper, to better learn the local contextual features of point clouds, we propose a new local feature aggregation module, EEP, which converts the representation of local regions from Cartesian coordinates to polar coordinates. To verify the effectiveness of the method, we conduct experiments on the representative S3DIS dataset and compare with eight methods to demonstrate the competitiveness of our approach. We also conduct ablation experiments on Area 5 of S3DIS to verify the effectiveness of EEP-Net.