Abstract
Monocular simultaneous localization and mapping (SLAM), which uses a single moving camera for motion tracking and 3D scene structure reconstruction, is an essential task for many applications, such as vision-based robotic navigation and augmented reality (AR). However, most existing methods recover only sparse or semi-dense point clouds, which are inadequate for many high-level tasks such as obstacle avoidance. Meanwhile, state-of-the-art methods use multi-view stereo to recover depth, which is sensitive to low-textured and non-Lambertian surfaces. In this work, we propose a novel dense mapping method for monocular SLAM that integrates deep depth prediction. More specifically, a classic feature-based SLAM framework first tracks camera poses in real time. An unsupervised deep neural network for monocular depth prediction then estimates dense depth maps for selected keyframes. Through a joint optimization, the predicted depth maps are refined and used to generate local dense submaps. Finally, contiguous submaps are fused under the ego-motion constraint to construct a globally consistent dense map. Extensive experiments on the KITTI dataset demonstrate that the proposed method remarkably improves the completeness of dense reconstruction in near real time.
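The abstract does not detail the joint optimization that reconciles CNN-predicted depth with the SLAM map. As a minimal, hypothetical sketch of one common ingredient of such a refinement, the snippet below fits a single least-squares scale factor that aligns a predicted (scale-ambiguous) depth map to sparse map-point depths from feature-based SLAM. All names and the toy data are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def align_depth_scale(pred_depth, sparse_uv, sparse_depth):
    """Align a predicted depth map to sparse SLAM depths by a global scale.

    pred_depth   : (H, W) depth map from a monocular prediction network
    sparse_uv    : (N, 2) integer pixel coordinates (u, v) of map points
    sparse_depth : (N,)   metric depths of those map points from SLAM

    Returns the rescaled depth map and the scale factor s that minimizes
    ||s * pred - sparse||^2, i.e. s = <pred, sparse> / <pred, pred>.
    """
    pred = pred_depth[sparse_uv[:, 1], sparse_uv[:, 0]]
    s = float(np.dot(pred, sparse_depth) / np.dot(pred, pred))
    return s * pred_depth, s

# Toy keyframe: predicted depth is uniformly off by a factor of 2.
pred_depth = np.full((4, 4), 5.0)
uv = np.array([[0, 0], [1, 2], [3, 3]])          # sparse map-point pixels
d = np.array([10.0, 10.0, 10.0])                 # their SLAM depths
refined, s = align_depth_scale(pred_depth, uv, d)
```

In practice a per-keyframe refinement would also handle outliers and local deformation, but a scale alignment of this kind is the usual first step when fusing scale-ambiguous predicted depth into a metric submap.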
Supported by the National Key Research and Development Program of China under Grant 2018YFB2100601, and the National Natural Science Foundation of China under Grants 61872023 and 61702482.
© 2021 Springer Nature Switzerland AG
Yan, F., Wen, J., Li, Z., Zhou, Z. (2021). Monocular Dense SLAM with Consistent Deep Depth Prediction. In: Magnenat-Thalmann, N., et al. Advances in Computer Graphics. CGI 2021. Lecture Notes in Computer Science(), vol 13002. Springer, Cham. https://doi.org/10.1007/978-3-030-89029-2_9
Print ISBN: 978-3-030-89028-5
Online ISBN: 978-3-030-89029-2