Abstract
Monocular simultaneous localization and mapping (SLAM), which uses a single moving camera for motion tracking and 3D scene structure reconstruction, is an essential task for many applications, such as vision-based robotic navigation and augmented reality (AR). However, most existing methods recover only sparse or semi-dense point clouds, which are inadequate for many high-level tasks such as obstacle avoidance. Meanwhile, state-of-the-art methods use multi-view stereo to recover depth, which is sensitive to low-textured and non-Lambertian surfaces. In this work, we propose a novel dense mapping method for monocular SLAM that integrates deep depth prediction. More specifically, a classic feature-based SLAM framework first tracks camera poses in real time. An unsupervised deep neural network for monocular depth prediction then estimates dense depth maps for selected keyframes. Through a joint optimization, the predicted depth maps are refined and used to generate local dense submaps. Finally, contiguous submaps are fused under the ego-motion constraint to construct a globally consistent dense map. Extensive experiments on the KITTI dataset demonstrate that the proposed method remarkably improves the completeness of dense reconstruction in near real time.
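The abstract does not detail the joint optimization that reconciles CNN-predicted depth with the SLAM map. As a minimal, hypothetical sketch of one common ingredient of such a refinement, the snippet below fits a single least-squares scale factor that aligns a predicted (scale-ambiguous) depth map to sparse map-point depths from feature-based SLAM. All names and the toy data are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def align_depth_scale(pred_depth, sparse_uv, sparse_depth):
    """Align a predicted depth map to sparse SLAM depths by a global scale.

    pred_depth   : (H, W) depth map from a monocular prediction network
    sparse_uv    : (N, 2) integer pixel coordinates (u, v) of map points
    sparse_depth : (N,)   metric depths of those map points from SLAM

    Returns the rescaled depth map and the scale factor s that minimizes
    ||s * pred - sparse||^2, i.e. s = <pred, sparse> / <pred, pred>.
    """
    pred = pred_depth[sparse_uv[:, 1], sparse_uv[:, 0]]
    s = float(np.dot(pred, sparse_depth) / np.dot(pred, pred))
    return s * pred_depth, s

# Toy keyframe: predicted depth is uniformly off by a factor of 2.
pred_depth = np.full((4, 4), 5.0)
uv = np.array([[0, 0], [1, 2], [3, 3]])          # sparse map-point pixels
d = np.array([10.0, 10.0, 10.0])                 # their SLAM depths
refined, s = align_depth_scale(pred_depth, uv, d)
```

In practice a per-keyframe refinement would also handle outliers and local deformation, but a scale alignment of this kind is the usual first step when fusing scale-ambiguous predicted depth into a metric submap.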
Supported by the National Key Research and Development Program of China under Grant 2018YFB2100601, and the National Natural Science Foundation of China under Grants 61872023 and 61702482.
© 2021 Springer Nature Switzerland AG
Yan, F., Wen, J., Li, Z., Zhou, Z. (2021). Monocular Dense SLAM with Consistent Deep Depth Prediction. In: Magnenat-Thalmann, N., et al. Advances in Computer Graphics. CGI 2021. Lecture Notes in Computer Science(), vol 13002. Springer, Cham. https://doi.org/10.1007/978-3-030-89029-2_9
Print ISBN: 978-3-030-89028-5
Online ISBN: 978-3-030-89029-2