Abstract
Three-dimensional (3D) object detection is a key technology in autonomous driving and robot control, with applications such as self-driving cars and unmanned aircraft systems. To improve detection accuracy, this paper proposes a 3D object detection network that combines a coordinate attention training mechanism, which generates voting feature points for better detection ability, with an overlap region penalty mechanism, which reduces false detections. In comparative experiments on the public large-scale 3D datasets ScanNet and SUN RGB-D, the proposed method obtained a mean average precision (mAP) of 60.1% and 58.0%, respectively, at an intersection-over-union (IoU) threshold of 0.25, outperforming mainstream algorithms such as F-PointNet, VoxelNet and MV3D. The improved method is expected to achieve higher accuracy for 3D object detection relying only on point cloud information.
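The 0.25 figure attached to the reported mAP is the overlap ratio (intersection over union, IoU) that a predicted box must reach against a ground-truth box to count as a correct detection. The following is a minimal sketch of that check, assuming axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax). The box layout and the function names are illustrative assumptions, not taken from the paper, and benchmark evaluation code may instead use oriented boxes.

```python
from typing import Sequence

# Assumed box layout (hypothetical, for illustration only):
# (xmin, ymin, zmin, xmax, ymax, zmax), axis-aligned.
Box = Sequence[float]


def box_volume(box: Box) -> float:
    """Volume of an axis-aligned box; degenerate boxes give 0."""
    volume = 1.0
    for axis in range(3):
        volume *= max(0.0, box[axis + 3] - box[axis])
    return volume


def iou_3d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    intersection = 1.0
    for axis in range(3):
        lo = max(a[axis], b[axis])          # start of the overlap on this axis
        hi = min(a[axis + 3], b[axis + 3])  # end of the overlap on this axis
        intersection *= max(0.0, hi - lo)   # zero if the boxes do not overlap
    union = box_volume(a) + box_volume(b) - intersection
    return intersection / union if union > 0 else 0.0


def is_true_positive(pred: Box, gt: Box, threshold: float = 0.25) -> bool:
    """A prediction matches a ground-truth box if IoU reaches the threshold."""
    return iou_3d(pred, gt) >= threshold
```

For example, two 2 x 2 x 2 boxes offset by 1 along x share a volume of 4 out of a union of 12, so their IoU is 1/3 and the prediction counts as a match at the 0.25 threshold.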
Acknowledgments
This research was supported by the Zhejiang Provincial Natural Science Foundation of China (Grant No. LQ21A040007) and the Scientific Research Fund of Zhejiang Provincial Education Department (Grant No. Y201941856).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, W., Zhu, S., Liu, H., Zhang, P., Zhang, X. (2023). Three-Dimensional Object Detection Network Based on Coordinate Attention and Overlapping Region Penalty Mechanisms. In: Lu, H., Blumenstein, M., Cho, SB., Liu, CL., Yagi, Y., Kamiya, T. (eds) Pattern Recognition. ACPR 2023. Lecture Notes in Computer Science, vol 14408. Springer, Cham. https://doi.org/10.1007/978-3-031-47665-5_7
DOI: https://doi.org/10.1007/978-3-031-47665-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47664-8
Online ISBN: 978-3-031-47665-5
eBook Packages: Computer Science, Computer Science (R0)