Abstract

Object retrieval and classification in point cloud data are challenged by noise, irregular sampling density and occlusion. To address these issues, we propose a point pair descriptor that is robust to noise and occlusion and achieves high retrieval accuracy. We further show how the proposed descriptor can be used in a 4D convolutional neural network for the task of object classification. We propose a novel 4D convolutional layer that is able to learn class-specific clusters in the descriptor histograms. Finally, we provide experimental validation on three benchmark datasets, which confirms the superiority of the proposed approach.

Paper preview


For the full text of the paper, see the IEEE version. A preprint is available on arXiv.

The first version of the code can be found in the code folder.

Main contributions

  1. We present a novel 4D convolutional neural network architecture that takes a 4D descriptor as input and outperforms existing deep learning approaches on realistic point cloud datasets.

  2. We design a handcrafted, point pair function-based 4D descriptor that offers high robustness on realistic, noisy point cloud data (a sketch of the classical point pair features such a descriptor builds on is given below this list).
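The sketch below computes the widely used four-component point pair feature (pair distance plus three angles between the normals and the connecting line, in the spirit of the surflet-pair relations of Wahl et al. [2]) and accumulates randomly sampled pairs into a normalized 4D histogram. The function names, bin counts, distance cutoff and sampling scheme are illustrative assumptions; the exact EPPF formulation is given in the paper.

    import numpy as np

    def point_pair_features(p1, n1, p2, n2):
        """Distance between the two points plus the three angles between
        the unit normals n1, n2 and the normalized connecting line."""
        d = p2 - p1
        dist = np.linalg.norm(d)
        if dist < 1e-9:
            return None  # coincident points carry no pair information
        d = d / dist
        f2 = np.arccos(np.clip(np.dot(n1, d), -1.0, 1.0))
        f3 = np.arccos(np.clip(np.dot(n2, d), -1.0, 1.0))
        f4 = np.arccos(np.clip(np.dot(n1, n2), -1.0, 1.0))
        return np.array([dist, f2, f3, f4])

    def pair_histogram_4d(points, normals, n_pairs=2000, bins=(5, 5, 5, 5), max_dist=1.0):
        """Accumulate features of randomly sampled point pairs into a
        normalized 4D histogram; pairs farther apart than max_dist are ignored."""
        rng = np.random.default_rng(0)
        feats = []
        for _ in range(n_pairs):
            i, j = rng.choice(len(points), size=2, replace=False)
            f = point_pair_features(points[i], normals[i], points[j], normals[j])
            if f is not None:
                feats.append(f)
        feats = np.asarray(feats)
        ranges = [(0.0, max_dist)] + [(0.0, np.pi)] * 3
        hist, _ = np.histogramdd(feats, bins=bins, range=ranges)
        return hist / max(hist.sum(), 1.0)

Averaging over many sampled pairs rather than relying on any single point is what makes histograms of this kind comparatively tolerant to noise and occlusion.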

Overview of the pipeline

Fig. Overview of the proposed object classification pipeline, which combines a novel handcrafted descriptor with a 4D convolutional neural network (CNN). Here, FC denotes a fully connected layer.

Fig. Architecture of the proposed 4D neural network.
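Standard deep learning frameworks do not ship a native 4D convolution, so as an illustration of how such a layer can be assembled from standard building blocks, the sketch below decomposes a 4D kernel into a stack of 3D kernels whose responses on shifted slices of the fourth dimension are summed. This is a minimal PyTorch sketch under that assumption (the class name Conv4d, valid padding and unit stride are illustrative choices); it is not the authors' implementation, which is available in the code folder.

    import torch
    import torch.nn as nn

    class Conv4d(nn.Module):
        """Naive 4D convolution built from k4 separate Conv3d layers.

        The 4D kernel (k1, k2, k3, k4) is split along its last dimension;
        each 3D slice of the kernel is convolved with shifted 3D slices of
        the input and the partial responses are summed."""

        def __init__(self, in_channels, out_channels, kernel_size):
            super().__init__()
            k1, k2, k3, k4 = kernel_size
            self.k4 = k4
            self.convs = nn.ModuleList(
                [nn.Conv3d(in_channels, out_channels, (k1, k2, k3), bias=(j == 0))
                 for j in range(k4)]  # bias is added only once, by the first conv
            )

        def forward(self, x):
            # x: (N, C_in, D1, D2, D3, D4); valid convolution, no padding
            out_d4 = x.shape[-1] - self.k4 + 1
            out = None
            for j, conv in enumerate(self.convs):
                # convolve every slice x[..., t + j] with the j-th 3D kernel
                slices = [conv(x[..., t + j]) for t in range(out_d4)]
                partial = torch.stack(slices, dim=-1)
                out = partial if out is None else out + partial
            return out

For example, Conv4d(1, 8, (3, 3, 3, 3)) applied to an input of shape (2, 1, 10, 10, 10, 10) produces an output of shape (2, 8, 8, 8, 8, 8).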

Results

TABLE I. Retrieval performance of the handcrafted descriptors. The values shown are means; the standard deviations and bold highlighting of the best results are given in the paper version of the table.

Dataset    Metric               OUR-CVFH   ESF     Wahl    EPPF    Short EPPF
Stanford   Total accuracy (%)   62.79      71.34   75.13   77.26   80.18
           Mean accuracy (%)    42.91      54.54   57.00   60.53   64.01
           Mean recall (%)      49.90      52.28   57.45   60.16   64.58
           F1-score             0.437      0.530   0.567   0.601   0.640
ScanNet    Total accuracy (%)   56.23      53.41   63.72   63.49   65.29
           Mean accuracy (%)    39.83      33.69   45.40   42.02   44.95
           Mean recall (%)      38.21      32.72   45.94   45.17   47.54
           F1-score             0.382      0.327   0.444   0.430   0.457
M40        Total accuracy (%)   53.22      65.87   74.41   73.00   73.68
           Mean accuracy (%)    46.43      58.91   67.50   65.79   66.43
           Mean recall (%)      49.26      59.96   70.33   69.12   69.79
           F1-score             0.465      0.588   0.680   0.666   0.671

TABLE II. Classification performance of deep learning approaches using 2D, 3D and 4D convolutional layers.

Dataset    Metric               PointNet   EPPF 2D   EPPF 3D   EPPF 4D
Stanford   Total accuracy (%)   64.30      82.01     81.94     83.22
           Mean accuracy (%)    42.48      64.26     66.37     65.11
           Mean recall (%)      40.47      70.88     60.94     72.13
           F1-score             0.395      0.652     0.665     0.672
ScanNet    Total accuracy (%)   63.04      70.39     70.57     72.10
           Mean accuracy (%)    37.50      38.98     44.35     45.70
           Mean recall (%)      19.53      63.52     54.53     56.58
           F1-score             0.209      0.433     0.472     0.488
M40        Total accuracy (%)   87.01      81.64     81.15     82.13
           Mean accuracy (%)    82.08      76.37     75.87     77.05
           Mean recall (%)      83.48      77.30     77.51     76.99
           F1-score             0.824      0.765     0.762     0.769
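The metrics in both tables can be read off a per-class confusion matrix. The sketch below shows one common way to compute them (total accuracy, mean per-class precision as "mean accuracy", mean per-class recall, and macro-averaged F1-score); it is an illustrative reading, and the exact definitions used in the paper may differ in detail.

    import numpy as np

    def classification_metrics(y_true, y_pred, n_classes):
        """Confusion-matrix based metrics for a multi-class classifier."""
        cm = np.zeros((n_classes, n_classes), dtype=np.int64)
        for t, p in zip(y_true, y_pred):
            cm[t, p] += 1  # rows: true class, columns: predicted class
        tp = np.diag(cm).astype(float)
        precision = tp / np.maximum(cm.sum(axis=0), 1)  # per-class precision
        recall = tp / np.maximum(cm.sum(axis=1), 1)     # per-class recall
        f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-9)
        return {
            "total_accuracy": tp.sum() / cm.sum(),
            "mean_accuracy": precision.mean(),
            "mean_recall": recall.mean(),
            "f1_score": f1.mean(),
        }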

Retrieval results

Fig. Descriptor and 4D neural network responses for the object "table" in the ScanNet dataset. Left: descriptor values. Middle: response of the first filter in the first layer. Right: filter response in the second layer. The rows show slices along the fourth dimension. Transparent bins correspond to constant offset values of the response (or to 0 for the descriptor values); colored bins correspond to varying values, with low values shown in blue and high values in red.

References

  1. C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “Pointnet: Deep learning on point sets for 3d classification and segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

  2. E. Wahl, U. Hillenbrand, and G. Hirzinger, “Surflet-pair-relation histograms: a statistical 3d-shape representation for rapid classification,” in Proceedings of IEEE International Conference on 3-D Digital Imaging and Modeling (3DIM), 2003, pp. 474–481.

Contact

For any questions or inquiries, please contact Dmytro Bobkov by email with the subject “Object Descriptor RAL”.

Last updated 24.04.2018