Abstract
Object retrieval and classification in point cloud data are challenged by noise, irregular sampling density, and occlusion. To address these challenges, we propose a point pair descriptor that is robust to noise and occlusion and achieves high retrieval accuracy. We further show how the proposed descriptor can be used in a 4D convolutional neural network for the task of object classification. We propose a novel 4D convolutional layer that is able to learn class-specific clusters in the descriptor histograms. Finally, we provide experimental validation on three benchmark datasets, confirming the superiority of the proposed approach.
Paper preview
For the full text of the paper, see the IEEE version. A preprint is available on arXiv.
The first version of the code can be found in the code folder.
Main contributions
- We present a novel 4D convolutional neural network architecture that takes a 4D descriptor as input and outperforms existing deep learning approaches on realistic point cloud datasets.
- We design a handcrafted 4D descriptor based on point pair functions that is highly robust on realistic, noisy point cloud data (an illustrative sketch follows below).
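As a rough illustration of the point pair idea, the sketch below computes the four classic surflet-pair-relation values of Wahl et al. for random oriented point pairs and accumulates them into a normalized 4D histogram. This is only a minimal sketch of the classic variant: the exact point pair functions, sampling strategy, and binning of the proposed EPPF descriptor differ and are detailed in the paper.

```python
import numpy as np

def point_pair_features(p1, n1, p2, n2):
    # Classic surflet-pair values: three angle cosines plus the pair distance.
    d = p2 - p1
    dist = np.linalg.norm(d) + 1e-12
    d = d / dist
    return np.array([np.dot(n1, d),   # angle between n1 and the pair direction
                     np.dot(n2, d),   # angle between n2 and the pair direction
                     np.dot(n1, n2),  # angle between the two normals
                     dist])

def pair_histogram(points, normals, num_pairs=10000, bins=(5, 5, 5, 5)):
    # Accumulate random point pairs into a normalized 4D histogram,
    # i.e. the kind of 4D input a descriptor-based network consumes.
    rng = np.random.default_rng(0)
    i, j = rng.integers(len(points), size=(2, num_pairs))
    keep = i != j
    feats = np.array([point_pair_features(points[a], normals[a],
                                          points[b], normals[b])
                      for a, b in zip(i[keep], j[keep])])
    ranges = [(-1, 1), (-1, 1), (-1, 1), (0, feats[:, 3].max())]
    hist, _ = np.histogramdd(feats, bins=bins, range=ranges)
    return hist / hist.sum()
```

Calling `pair_histogram(points, normals)` on an `(N, 3)` point array with unit normals yields a 4D histogram of the kind sketched above; the actual EPPF descriptor used in the paper is defined in the full text.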
Overview of the pipeline
Fig. Overview of the proposed object classification pipeline, which combines a novel handcrafted descriptor with a 4D convolutional neural network (CNN). Here, FC denotes a fully connected layer.
Fig. Architecture of the proposed 4D neural network.
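Common deep learning frameworks ship no native 4D convolution, so a 4D layer has to be assembled from lower-dimensional primitives. The following PyTorch sketch shows one possible realization (an assumption for illustration, not the authors' implementation; see the code folder for that): it composes 3D convolutions along the fourth histogram axis.

```python
import torch
import torch.nn as nn

class Conv4d(nn.Module):
    # Valid (no padding) 4D convolution assembled from one 3D convolution
    # per position of the kernel along the fourth spatial axis.
    # Input: (N, C_in, D1, D2, D3, D4) -> output: (N, C_out, D1-k+1, ..., D4-k+1).
    def __init__(self, in_channels, out_channels, k):
        super().__init__()
        self.k = k
        self.convs = nn.ModuleList(
            nn.Conv3d(in_channels, out_channels, k, bias=(i == 0))
            for i in range(k))  # only one bias term overall

    def forward(self, x):
        d4 = x.shape[-1]
        slices = []
        for j in range(d4 - self.k + 1):
            # Sum the k 3D convolutions covering positions j .. j+k-1
            # of the fourth axis.
            slices.append(sum(self.convs[i](x[..., j + i])
                              for i in range(self.k)))
        return torch.stack(slices, dim=-1)
```

For example, `Conv4d(1, 8, 3)(torch.randn(1, 1, 10, 10, 10, 10))` yields a tensor of shape `(1, 8, 8, 8, 8, 8)`.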
Results
TABLE I. Retrieval performance of the handcrafted descriptors. Mean values are shown; the corresponding standard deviations are reported in brackets in the paper. Best performance is shown in bold.
| Dataset | Metric | OUR-CVFH | ESF | Wahl | EPPF Short | EPPF |
|---|---|---|---|---|---|---|
| Stanford | Total accuracy (%) | 62.79 | 71.34 | 75.13 | 77.26 | **80.18** |
| | Mean accuracy (%) | 42.91 | 54.54 | 57.00 | 60.53 | **64.01** |
| | Mean recall (%) | 49.90 | 52.28 | 57.45 | 60.16 | **64.58** |
| | F1-score | 0.437 | 0.530 | 0.567 | 0.601 | **0.640** |
| ScanNet | Total accuracy (%) | 56.23 | 53.41 | 63.72 | 63.49 | **65.29** |
| | Mean accuracy (%) | 39.83 | 33.69 | **45.40** | 42.02 | 44.95 |
| | Mean recall (%) | 38.21 | 32.72 | 45.94 | 45.17 | **47.54** |
| | F1-score | 0.382 | 0.327 | 0.444 | 0.430 | **0.457** |
| M40 | Total accuracy (%) | 53.22 | 65.87 | **74.41** | 73.00 | 73.68 |
| | Mean accuracy (%) | 46.43 | 58.91 | **67.50** | 65.79 | 66.43 |
| | Mean recall (%) | 49.26 | 59.96 | **70.33** | 69.12 | 69.79 |
| | F1-score | 0.465 | 0.588 | **0.680** | 0.666 | 0.671 |
Table II. Classification performance of deep learning approaches using 2D, 3D and 4D convolutional layers.
| Dataset | Metric | PointNet | EPPF 2D | EPPF 3D | EPPF 4D |
|---|---|---|---|---|---|
| Stanford | Total accuracy (%) | 64.30 | 82.01 | 81.94 | 83.22 |
| | Mean accuracy (%) | 42.48 | 64.26 | 66.37 | 65.11 |
| | Mean recall (%) | 40.47 | 70.88 | 60.94 | 72.13 |
| | F1-score | 0.395 | 0.652 | 0.665 | 0.672 |
| ScanNet | Total accuracy (%) | 63.04 | 70.39 | 70.57 | 72.10 |
| | Mean accuracy (%) | 37.50 | 38.98 | 44.35 | 45.70 |
| | Mean recall (%) | 19.53 | 63.52 | 54.53 | 56.58 |
| | F1-score | 0.209 | 0.433 | 0.472 | 0.488 |
| M40 | Total accuracy (%) | 87.01 | 81.64 | 81.15 | 82.13 |
| | Mean accuracy (%) | 82.08 | 76.37 | 75.87 | 77.05 |
| | Mean recall (%) | 83.48 | 77.30 | 77.51 | 76.99 |
| | F1-score | 0.824 | 0.765 | 0.762 | 0.769 |
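For completeness, the sketch below shows one plausible way to compute the four metrics reported in Tables I and II from a confusion matrix. The definitions assumed here (total accuracy as the overall fraction of correct predictions, mean accuracy and mean recall as per-class precision and recall averaged over classes, and F1-score as the mean of per-class F1 values) are common in the retrieval literature, but the paper's exact evaluation protocol should be taken from the full text.

```python
import numpy as np

def evaluation_metrics(conf):
    # conf[t, p] counts samples of true class t predicted as class p.
    # Assumed definitions; they may differ from the paper's protocol.
    conf = np.asarray(conf, dtype=float)
    eps = 1e-12
    total_accuracy = np.trace(conf) / conf.sum()
    precision = np.diag(conf) / np.maximum(conf.sum(axis=0), eps)
    recall = np.diag(conf) / np.maximum(conf.sum(axis=1), eps)
    f1_per_class = 2 * precision * recall / np.maximum(precision + recall, eps)
    return {"total_accuracy": total_accuracy,
            "mean_accuracy": precision.mean(),
            "mean_recall": recall.mean(),
            "f1_score": f1_per_class.mean()}
```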
Fig. Descriptor and 4D neural network responses for a "table" object in the ScanNet dataset. Left: descriptor values. Middle: response of the first filter in the first layer. Right: filter response in the second layer. The rows show slices along the fourth dimension. Transparent bins correspond to constant offset values of the response (or to 0 for the descriptor values); colored bins correspond to varying values, with low values shown in blue and high values in red.
References
- C. R. Qi, H. Su, K. Mo, and L. J. Guibas, "PointNet: Deep learning on point sets for 3D classification and segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
- E. Wahl, U. Hillenbrand, and G. Hirzinger, "Surflet-pair-relation histograms: A statistical 3D-shape representation for rapid classification," in Proceedings of the IEEE International Conference on 3-D Digital Imaging and Modeling (3DIM), 2003, pp. 474–481.
Contact
For any questions or inquiries, please contact Dmytro Bobkov with the subject “Object Descriptor RAL”.
Last updated 24.04.2018